ABSTRACT

The 1969 Western Regional Conference on Testing
Problems dealt with developments and assessments in educational
centers and laboratories. The following speeches were presented: (1)
"Behavioral Objective Specifications in Evaluation: Relevant or
Irrelevant" by Marvin C. Alkin; (2) "Approaches to the Validation of
Learning Hierarchies" by Margaret C. Wang; (3) "Some Problems with
Regard to Research and Development in Higher Education" by Leland L.
Medsker; (4) "Educational Research, Educational Development and
Evaluation Studies" by John K. Hemphill; and (5) "The Challenge of
Multi-Agency Involvement in Development" by Ray Jongeward. A list of
conference participants concludes the report. (KM)
Preface



The theme for the 1969 Western Regional Conference on Testing
Problems was Developments and Assessments in Educational
Centers and Laboratories.

The program began with Marvin Alkin discussing whether or not
the behavioral objective specifications in evaluation were relevant or
irrelevant. Margaret Wang continued by outlining the approaches
pursued in the validation of learning hierarchies. Leland Medsker
completed the morning session by indicating some of the problems
in research in higher education. John Hemphill led the afternoon
session by directing his attention to educational research and
development in relation to evaluation studies, followed by Ray Jongeward
who described the challenge of multi-agency involvement in the
development of a prescribed academic course for use in rural or
deprived areas.

Junius A. Davis, Chairman
Behavioral Objective Specifications
in Evaluation: Relevant or Irrelevant?

MARVIN C. ALKIN

The question posed by the title of this paper, "The Use of Behavioral
Objectives in Evaluation: Relevant or Irrelevant?" is not readily
answerable. Indeed, there is no single solution to the question. The
use of specified behavioral objectives in evaluation is neither
relevant nor irrelevant. It is the threefold thesis of this paper that (1)
behaviorally stated objectives are of relevance only to certain stages
in the evaluation process; (2) even in those stages where it is relevant
to state student behavioral objectives, objectives specification alone
ceases to be of singular significance with the increasing complexity
of the program; and (3) even in relatively noncomplex programs
within stages amenable to objectives specification, there is little
research evidence showing whether evaluation using specified student
behavioral objectives "makes a difference."

The intent of this paper, however, is not to discount completely
the value of specifying objectives in the evaluation of instructional
programs; to do so would be ludicrous. Alice and the Cheshire Cat 
probably said it best in Alice's Adventures in Wonderland: 
"Come, it's pleased so far," thought Alice, and she went on. "Would
you tell me, please, which way I ought to go from here?"

"That depends a great deal on where you want to get to," said the Cat.

"I don't much care where . . ." said Alice.

"Then, it doesn't matter which way you go," said the Cat.

But behavioral objectives specification is not necessarily a
panacea for evaluation problems of all types. While all enterprises should
have a goal, these goals are not necessarily always specifiable in
student behavioral terms. I would submit also that the continually
broadening definition of evaluation has considerably modified views
about the need for specification of behavioral objectives.

The last two years have represented an exciting period in the field 
of evaluation. Indeed, it would not be an overstatement to maintain
that evaluation as a field has just begun to assume an identity of its
own. I would agree with Egon Guba* that a major failing of
evaluation today stems from the lack of an adequate definition. Past
definitions have equated it with either: (1) measurement and testing,
(2) statements of congruence between performance and objectives, or
(3) professional judgment. None of these by itself is really an
inclusive enough definition for the multiplicity of activities now
regarded as evaluation. During the past year, a consensus has been
developing concerning a broader, more comprehensive definition of
evaluation. This expanded view takes into consideration the
decision-making functions, since an evaluation must be predicated on,
and adapted to, the specific problem or situation under analysis.

In view of the fact that there is no definitive statement of evalua- 
tion, it would be inappropriate and inaccurate of me to present my 
definition as "the" generally accepted one. However, in an effort to
provide some framework for this paper, I will, somewhat hesitantly,
step forward and present my definition of evaluation. Evaluation is
the process of ascertaining the decisions to be made, selecting
related information, and collecting and analyzing that information in
order to report summary data useful to decision makers in selecting
among alternatives.

The first part of the definition of evaluation presented here deals
with ascertaining the decisions to be made. The decision maker, not
the evaluator, determines the questions to be asked or the decisions
to be made. The task of the evaluator is to determine from the
decision maker the decisions for which information is required. The
evaluator can and should, however, point out inconsistencies,
potential difficulties, or additional data that might modify the
decision maker's views on the relevance of certain decisions.

* Egon Guba, Director, National Institute for the Study of Educational Change,
Indiana University, Bloomington, Indiana.

The second task of evaluation deals with the specifications of
required information in light of the system's objectives. The specific
nature of the information required will differ, of course, depending
upon the kind of decision to be made. The task of the evaluator in
specifying information requirements includes the development of the
research design of the project, and the selection and/or development
of instruments designed to provide the information appropriate to
the decisions which must be made.

Data collection and analysis are tasks of prime concern to the
evaluator. The evaluator will encounter different problems
associated with these tasks depending upon the nature of the decisions to
be made.

One of the most vital parts of the evaluation process is reporting
to the decision maker. Most evaluators often overlook this function;
indeed, they often consider it a merely pro forma exercise. If the
purpose of evaluation is to provide information that will enable
decision makers to form judgments about a program or about
alternatives, then the nature and form of the reporting should be
appropriate to the problem and to the audience.

STAGES OF EVALUATION 

This definition of evaluation carries with it a concern for the deci- 
sions to be made. Thus, if we are to understand the evaluation proc- 
ess, it is necessary to categorize educational decision situations. In 
this classification it would be necessary to examine the nature and 
kinds of decisions likely to require evaluative data. 

I have identified what I consider to be the five stages of an evalu- 
ation. Each is designed to provide and report information useful to 
a decision maker in making judgments. They are (1) systems assess- 
ment, (2) planning, (3) program implementation, (4) program im- 
provement, and (5) program certification. I should acknowledge that 
I have borrowed liberally in the development of these stages from
the work of Malcolm Provus* as well as Daniel Stufflebeam.†

* Dr. Malcolm Provus, Director, Board of Public Education, Pittsburgh,
Pennsylvania.

† Professor Daniel L. Stufflebeam, Director, Evaluation Center, Ohio State
University, Columbus, Ohio.
The first area in which evaluation might take place is in the
assessment of needs. Needs assessment is a means of determining the
educational objectives most appropriate for a particular situation. The
needs may be represented as the gap between the goal and the
present state of affairs. Thus, the evaluation problem becomes one of
assessing the needs of students, of the community and of society in
relation to the existing situation. Needs assessment does not refer to
specification of process characteristics appropriate for a district,
school, or classroom. The needs assessment must be related to the
ultimate behavior of clients of one type or another (pupils, parents,
community, etc., all are clients of the school). To put it simply, needs
assessment must be a statement of objectives in terms of outputs
rather than process characteristics of the system.

No doubt it is obvious to you from these examples, as it is to all 
of those who have been engaged in needs assessment under a Title 
III program, that the process of deciding purposes of needs assess- 
ments, as well as specifying, collecting, analyzing, and reporting in- 
formation, is quite different from the methodology and techniques 
usually employed in typical evaluation. 

PLANNING 

The planning stage in evaluation is concerned with information 
which will enable the decision maker to select between alternative 
processes in order to make a judgment as to which process should 
be introduced into the system in order to fill most efficiently the
critical needs which have been previously determined. After the decision
maker receives the needs assessment evaluation, he might make a
decision as to the appropriate means of fulfilling that need.
Alternatively, the decision maker might designate several possibilities and
ask the evaluator to provide information on the possible impact of
each. Thus, in the planning stage, the evaluator provides the data
for an evaluation of a program prior to its inception. The task of the
evaluator is to look forward to the attainment of goals and to
determine the likely goal achievement or outcomes. To repeat this in yet
another way, the purpose of an evaluation in the planning stage is
to assess the potential relative effectiveness of different courses of
action.

It is quite obvious from this discussion that the collection and 
analysis of data of the type required for this evaluation stage will be 
quite different from collection and analysis problems for other stages. 
The techniques may require both internal and external evaluation
criteria. (The most appropriate technique might be informed
judgment or other so-called soft data.)

The next step in the evaluation process is determining the extent 
to which the program has been implemented in the manner in which 
it was described in the design. (A part of the information specifica- 
tion, collection, analysis and reporting process is the specification of 
the design or procedures by which each of these activities will be 
accomplished.) 

In the case of an existing program, where no known changes have 
been implemented, the evaluation task for this stage is to determine 
the degree to which planning descriptions of the program coincide 
with the actual program and planning descriptions of the students 
and the context coincide with the actual students and context. 

PROGRAM IMPROVEMENT 

The evaluator can assume a leadership role in program
improvement by providing as much information as possible about the
relative success of its parts. In order to perform program improvement
evaluation, it is necessary to recognize the basically interventionist
role that the evaluator has been asked to play. As the evaluator
identifies problems and collects and analyzes information, data are
presented immediately to the decision maker in order that changes to
improve the operation of the program may be executed within the 
system. This stage of evaluation has often been overlooked or ig- 
nored by the traditional evaluator who has attempted to reproduce 
the antiseptic sterility of a laboratory in the real world. This approach 
may make a fine experiment, but it does little to improve a program 
which is often not in its final form. 

PROGRAM CERTIFICATION 

Finally, evaluation must provide information to the decision maker
that will enable him to make judgments about the instructional
program as a whole. This is the "audit" stage of evaluation. The
evaluator might attempt to provide information which will enable the
decision maker to determine whether the program should be
eliminated, modified, retained or expanded.

In this stage, the need for valid and reliable data would generally
mandate that the evaluator attempt to apply as rigid a set of controls
as possible. The evaluator might use pre- and post-test designs and
employ sophisticated statistical techniques for analyzing the data
whenever possible. Intervention should be avoided in this stage.

USE OF STUDENT BEHAVIORAL OBJECTIVES 
IN VARIOUS EVALUATION STAGES 

I will discuss each of these stages and the categories of decisions 
related to these stages in order to demonstrate the relevance or lack 
of relevance of student behavioral objectives for each. A decision is 
associated with each of the five stages and it is the job of the
evaluator to provide the information that will assist the decision maker in
selecting between alternatives for that decision. The nature of the 
decision at each stage, I believe, will demonstrate that information 
on the achievement of student behavioral objectives is not relevant 
to some stages and is not the only source of information appropriate 
for other stages. 

In the discussion that follows I do not mean to imply that the 
evaluator will necessarily participate in each stage of the evaluation. 
In some instances prior decisions may already have been made and 
the evaluator may be asked, simply, to provide information for suc- 
ceeding stages. In other instances the nature of the information to 
be collected may be relatively simple and the process of information 
selection, collection and analysis may be internalized by the decision 
maker and his staff. However, for the sake of clarity, we will assume
a hypothetical situation where the evaluator is asked to provide in- 
formation for decisions at each of the five stages. 

The first question facing the evaluator is related to selection of 
objectives for the system or modification of existing objectives. Thus, 
depending upon the situation, the decision maker may want
information on whether various constituent bodies (e.g., the community)
concur with the existing objectives of the system and what changes 
are needed. It may be appropriate to present information on the 
potential relevance of alternative objectives in terms of possible fu- 
ture significance. 

In a hypothetical situation, a school principal might be faced with 
budgetary decisions and want to get some insight as to how best to 
spend money in an incremental budget. He is anxious to spend this 
in a manner that is likely to be most beneficial to the school in terms 
of its needs. The evaluator has been asked to provide information
about various possible objectives for the system, including some
presently stated objectives which may be inadequately met.
Thus, the evaluator may inform the decision maker that a number of
behavioral objectives of the system have been rated highly relevant
by the community and that the evidence appears to
demonstrate that these have been inadequately met. High on this
list might be the students' inability to defend themselves against
attacks by other students, i.e., trained in "the art of self-defense." The
evaluator might also provide information which would indicate the
potential value of selecting "self-defense" as an objective of the
system. Thus, the needs assessment evaluation would provide the
decision maker with information that would assist him in selecting 
between alternative objectives. The information is provided by the 
evaluator, but the relative weighting of the alternatives must be 
made by the decision maker. 

It is obvious from this example that the major source of informa- 
tion provided by the evaluator in this stage is related to students' 
behavioral objectives for the system. In essence the evaluator
provides alternative objectives along with other descriptive information
to the decision maker. The student behavioral objectives are of great 
relevance in this stage of evaluation. 

In the planning stage, the evaluator provides information about 
possible means of achieving the objectives. The question asked by 
the decision maker is "What process is to be chosen from among a 
list of alternatives?" The evaluator is not an instructional
development expert and ordinarily should not assume the job of developing
a program appropriate to the stated objective. However, the decision 
maker might have narrowed his choice to several alternatives and 
would like additional information on each of these alternatives. 

In the case previously presented, if we assume that the decision 
maker has selected a behavioral objective related to self-defense in- 
struction and has considered three alternative processes, then the 
evaluator might provide information related to each of these proc- 
esses. The information of necessity will be limited in this pre-imple- 
mentation stage. The evaluator will examine each of the processes in 
terms of various internal criteria, such as the extent to which the 
materials purport to achieve the specified objective, the clarity of 
the materials and the cost of the materials. 

In addition, the evaluator may invoke certain external criteria. An 
examination might be made of the literature related to the use of 
this process to determine the extent to which it had been found to 
be successful in similar situations. In the absence of any evidence 
related to the use of these materials, the evaluator might choose to use
systematically sampled expert judgment about the potential worth
of each of the processes considered. Thus, given the information
collected and analyzed for this stage, the decision maker would be in
a position to make a more rational choice. While it is true that the
processes are examined in relation to potentially desired student 
outputs, the main source of information for the second stage of eval- 
uation is not information on student behavioral objectives but on 
processes. 

The evaluation related to the third stage, program implementation,
has as its purpose providing information on whether the process
which was selected has been implemented according to plan and
whether the context of the situation in terms of the fixed attributes
of the program has been described properly in the planning stage.
That is: Did the equipment arrive on time? Or does the description
of the students in the planning stage, which was considered at the
time when the process was selected, correspond with the actual sit- 
uation? It is obvious that in this stage, also, specification of student 
behavioral objectives is not of critical importance. 

In the example that we have been using, let us assume that the 
decision maker has examined the alternative processes and has de- 
cided to introduce a course in shotgun manufacturing to achieve 
the objective related to "self-defense." One question for the evalu- 
ator is: Did the gun barrels arrive on time? 

In the fourth stage, program improvement, specified student
behavioral objectives are of major importance. In this stage, the
evaluator is concerned with determining changes in students and
observing students' achievement on a regular basis in order to provide
feedback to the decision maker which will be helpful to him in
modifying the program. In addition to information related to the
achievement of students on certain objective dimensions, the evaluator
has as his function within this stage the provision of information relating
to the effect of the introduced process upon other processes of the 
system. Thus, in the example we have been using, the evaluator 
might note that while students seem to be doing very well in learn- 
ing to construct shotguns, there appear to be deleterious effects upon 
teacher-student relationships. Moreover, other students in the school 
may, for some reason or another, be afraid of those in the
experimental program. Finally, the evaluator may note that the general
appearance of the school building has suffered. (The walls are pitted,
and many have large gaping holes in them.) On the basis of this
information, the decision maker may choose to modify the program,
expand it because of the surprisingly good results, or perhaps even
delete it immediately.

Let us assume that the program has been allowed to continue and 
has gone through the program improvement stage to the point where 
the decision maker is now satisfied with the program and wants to 
provide a rigid empirical test. At this juncture, the evaluator may 
be called upon to provide an evaluation related to the program cer- 
tification function. The evaluator is not being asked to certify the 
program, but rather to provide information that will allow a deci- 
sion to be made about certification. As opposed to the previous 
stages, the role of the evaluator in the area of program certification 
is noninterventionist. Thus, in the example noted above, the
evaluator will attempt to provide information to the decision maker
on the final (or nearly final) outputs of the system in student or
other terms as a function of the course in shotgun manufacturing.
Again, student behavioral objectives should be considered.
The evaluator will also want to provide information on the extent to 
which students are now better able to defend themselves. There are, 
however, a number of other outcomes of the systems that were per- 
haps not anticipated which might well be reported to the decision 
maker as part of the program certification evaluation. For example, 
he might note that there has been a considerable increase in the 
amount of violence in the community and an increase in the number
of armed robberies. 

I have attempted to demonstrate in the preceding paragraphs that 
behavioral objectives are of considerable relevance to various stages 
of evaluation, are of relevance along with other kinds of information
in several stages of evaluation, and are of little relevance and, indeed,
perhaps irrelevant in other stages.

In areas traditionally conceived of as evaluation (i.e., program im- 
provement and program certification), there is ordinarily a great 
need for specifying objectives in behavioral terms as we have just 
pointed out. But even here I must sound a dissident note. Those
advocating the use of behavioral objectives as the main basis for
evaluation are usually concerned only with the individual student or,
at most, with the classroom as the unit of analysis. The examination
of more complex programs often makes it impossible to state
behavioral objectives at the outset. One can think of broadscale
educational systems with outcomes that are not clearly definable and
where the process of specifying objectives is an iterative one. The
complexities of this kind of system are often so great that to speak
of objectives in any concrete sense is to mask the real outputs of
the system. The outcomes and consequences of all of the many
interactions within a system are very great, and are often at
considerable variance with the objectives of the system.

Also, the nature of the context at this macro-level of complexity is
of considerable significance. While we would maintain that, at the
micro-level, the most important element in evaluation is the
specification of objectives, in large educational systems the context
or nature of the surroundings has tremendous impact on the outcomes of
the system. The Coleman Report is just one example of a whole line
of research which has tended to substantiate this thinking.

Other difficulties in evaluating complex systems involve accurate 
specification of the instructional treatment. That is to say, often the 
instructional treatment is neither clean, easily identifiable nor re- 
producible. It is, instead, a vast array of complex, interactive ele- 
ments loosely called "instruction." 

Thus, we have shown that what is required in this kind of
evaluation is not simply a specification of objectives, but, rather, a total
examination of a system, with all of the implications that derive from
systems theory. A systems evaluation carries with it the necessity for 
specifying the inputs and outputs of the system, and the understand- 
ing that the process of evaluation must be an interactive one in which 
successive stages produce additional information. 

If we think of evaluation as being the process of selecting,
collecting, analyzing and providing information for decision makers, then
the implications of the data requirements for the evaluation of
complex educational systems are readily apparent. In addition to
specifying the objectives of the system and the degree to which the system
has met these objectives, data must also be provided on other
outcomes (unanticipated outcomes, consequences), on the inputs, on
accurate descriptions of the alternative processes used, and on the
input-output relationships, especially as they relate to the factors
which can be considered by the decision maker.

An activity just beginning at the UCLA Research and Development
Center is designed to provide answers about the appropriate
information necessary for various decisions. The project, the School
Evaluation Project, is being directed by Stephen Klein* and myself and is
attempting to develop an information system that will help school
principals predict student outputs of their schools and make decisions
about how to improve these outputs.

The project is uniquely different from most sociopsychological
descriptive studies of education in that the orientation focuses on the
decisions made by school principals. The project will attempt to
determine information requirements (that is, in terms of the definition
of "appropriate evaluations") for each of a number of decisions or
classes of decisions. It is hoped that the results of this research will
provide insights into the relative importance of various kinds of
information, including those related to behavioral objectives, for
various types of educational decisions.

LACK OF RESEARCH EVIDENCE 

Finally, it is imperative to note that even for relatively discrete 
units of evaluation, there is no definitive evidence that behavioral 
objectives specification "makes a difference." It has not been
substantiated clearly that specifying objectives in behavioral terms for a
program modifies the instructional procedures or changes the amount
of student learning that takes place. If, from the point of view of the
educator, the most relevant considerations are the decisions that will
be made as a consequence of the information reporting, then it will
be of utmost concern to determine the impact of describing
objectives in behavioral terms. There is little evidence to substantiate that
such descriptions and the available data relating to them modify
the nature of the subsequent judgments by decision makers.

A study by Eva Baker attempted to contrast the effect that
behavioral and non-behavioral objectives have on pupil learning and
found no significant differences in items directly measuring the
objectives or in the transfer items. However, this study dealt with
modification of student outputs as a function of using behavioral
objectives rather than the impact of such use on decision makers. In a
study in which adult students are the decision makers, Blaney and
McKie attempted to determine whether knowledge of instructional
objectives in an adult education program assists participants in
attaining objectives. The hypothesis that the group that was
given behaviorally stated objectives would do significantly better
than the control group was confirmed. However, in a typical
educational situation, one would think of the teacher or another
intermediate control agent as the appropriate decision maker rather than
the student. A study presently underway by Dr. Eva Baker at the
UCLA Research and Development Center will attempt to examine
the use of student response data (that is, information related to the
achievement of student behavioral objectives) in relation to the
subsequent revisions of the instructional material made by the decision
maker.

* Dr. Stephen P. Klein, Executive Officer, Elementary School Evaluation
Project, Center for the Study of Evaluation, University of California, Los Angeles.

Intuitive feeling, however, would lead to the view that, all things 
being equal, it is probably better to specify objectives than to not do 
so at all. With this in mind, and with a deep conviction at the UCLA 
Center for the Study of Evaluation that the specification of system ob- 
jectives should be the function of a local decision maker rather than 
of an external body, the Center is developing a system to help the 
decision maker determine such selections. 

In an attempt to provide local decision makers with behavioral
objectives and appropriate test items, we have established an
Instructional Objectives Exchange at the Center. The Exchange is
under the direction of Rodney W. Skager* and James Popham† and
has been established in response to several problems presently
existent in the field. These are:

1. The role of the teacher/decision maker as an objectives selector, 
rather than as an objectives generator 

2. The need for test items related to objectives 

3. The imminent duplication of efforts in various parts of the United
States.

While the Instructional Objectives Exchange project will function
as a clearing-house in the area of objectives and items, our prime
intended use of the Exchange at the Center goes beyond this. We
plan to use some of the material collected in the Exchange in order
to study the form and use of behavioral objectives. For example, we 
want to answer the following questions: 

1. Do alternative modes of stating objectives have a relationship
to pupil performance?

2. Does using behavioral objectives as the basis for determining
information requirements modify the nature of the ultimate
judgments of decision makers?

3. What are the types of decisions made by teachers,
administrators and others who have been presented with objective-based
data?

* Professor Rodney W. Skager, Graduate School of Education, University of
California, Los Angeles, California.

† Professor W. James Popham, Graduate School of Education, University of
California, Los Angeles, California.

We hope that the results of these studies will provide some in- 
sights into the relevance of behavioral objectives as a part of evalu- 
ation of relatively well defined instructional programs, particularly 
in the program development and program certification stages. 

A RESPONSE 

The activities of the UCLA Center for the Study of Evaluation are 
vitally related to the evaluation problems faced by schools and school 
districts every day. We regard ourselves as a research and development 
unit whose goal is to make a difference in education. Our activities 
in conceptualizing evaluation are designed, among other things, 
to enable us to understand the potential relevance of various 
procedures in evaluations of different types. Our School Evaluation 
Project will hopefully provide insights into the information 
requirements of decision makers. Our activities related to the 
Instructional Objectives Exchange and Measurement System project will 
provide evidence as to the form and use of program objectives in 
decision making in the improvement and certification stages of 
evaluation. 

The kind of mapping of the domain that is exemplified by the 
three activities named will ultimately allow us to answer in some 
definitive way whether the need for specification of objectives in 
evaluation is relevant or irrelevant. 
Approaches to 
the Validation of Learning Hierarchies 

MARGARET C. WANG* 

Several independent lines of investigation over the past decade have 
been focusing on problems of the temporal order in which cognitive 
behaviors are acquired. Developmental psychologists, particularly 
those exploring the implications of Piaget's theories of cognitive 
development, have been interested in demonstrating the existence of 
regular sequences in the acquisition of concepts and logical 
operations. At the same time, test and measurement specialists 
interested in "criterion-referenced" testing have recognized that test 
batteries based on reliably established acquisition sequences might 
offer a means of economically estimating performance on a variety of 
specific behaviors from a relatively small number of test items. 
Finally, curriculum and instructional designers have been interested 
in identifying optimal sequences for teaching new skills and concepts. 
Although these three groups have rather different goals, their concern 
with sequence in the acquisition of behavior has given them a common 
interest in the twin problems of generating and validating 
"behavioral hierarchies": that is, sets of behaviors which can be 
shown to be acquired in an invariant sequence, implying that later 
behaviors are dependent upon, or in some sense "built out of," earlier 
ones. 

* This paper was co-authored by Lauren B. Resnick and Margaret C. Wang. 
The research reported herein was supported by the Personnel and Training Branch 
of the Office of Naval Research, by a grant from the Project Follow Through of the 
U.S. Office of Education, and by the Learning Research and Development Center, 
supported as a research and development center by funds from the U.S. 
Office of Education, Department of Health, Education and Welfare. The 
opinions expressed in this publication do not necessarily reflect the position or 
policy of the Office of Education and no official endorsement should be inferred. 
The developmental psychologist's interest in hierarchies derives 
largely from a concern for verifying the existence of invariant stages 
in development, through which all children pass. Hierarchical "stage" 
theories of development have been proposed by many developmental 
theorists, of whom the most frequently cited with respect to cognitive 
development is Piaget. Such theories essentially predict the 
order in which certain behaviors (concepts, intellective and also 
physical skills) will appear. They do not necessarily imply a 
"maturational" as opposed to a "learning," or organism-environment 
interaction, theory of how such changes occur. 

TABLE 1 
Cross-Sectional Study Analysis 
Per Cent of Conservation Responses for Mass, Weight, and Volume 
at Successive Age Levels (N = 25 at Each Age Level) 
Most studies of developmental sequence have employed cross-sectional 
designs in which samples of several ages are tested on a 
set of behaviors. An empirical sequence can thus be derived from 
the percentages of children able to perform the tasks at various ages. 
An example of data from a cross-sectional study appears in Table 1. 
The study, by Elkind, examined the ages at which conservation of 
mass, weight and volume were acquired. Note that the percentage 
of children conserving mass mounts sharply at age 7; the same rise 
in percentage takes place at age 9 for weight, and not at all (up to 
the age of 11) for volume. These data show a clear order of difficulty 
among the three tasks and they suggest the hypothesis that each 
individual child acquires conservation of mass first, then weight and 
finally volume. 

A cross-sectional study, however, cannot directly test the hypothesis 
that the order of acquisition is invariant for each individual; i.e., 
that the behaviors are hierarchically organized. Longitudinal studies, 
in which an initial sample of children is re-examined over a period 
of years, would permit the testing of hierarchical sequences. However, 
longitudinal studies are extremely difficult and costly to mount. 
Despite general recognition of their value to developmental 
psychology, relatively few such studies of intellectual development 
have actually been conducted.* 

A few psychologists have seen in scalogram analysis, originally 
developed by Guttman as a method of scaling responses to attitude 
questionnaires, a technique that could combine the power of 
longitudinal studies to examine intra-individual sequence contingencies 
with the speed and lower cost of cross-sectional studies. These 
methods have been applied to sequences of behaviors in the areas of 
haptic perception, logical judgments, moral judgments, number 
concepts, and classification skills. 



TABLE 2 
A Perfect Guttman Scale 
Like cross-sectional studies, scalogram studies require the 
administration of a battery of tests, presumed to sample behaviors at 
various points in a linear hierarchy, to a group of subjects. Although 
the age of subjects may vary, age itself is not the independent 
variable in scalogram studies. Instead, scores on the test battery are 
examined for "scalability," the extent to which the tests can be 
arranged in an order such that passing a certain test reliably predicts 
passage of all tests lower in the scale.† Table 2 shows a hypothetical 
set of perfectly scaled data. Subjects are listed down the side, tests 
across the top. Note that once a subject fails a test (failures are 
marked in the table), he fails all subsequent tests. The existence of 
such a "perfect" scale, or an acceptable approximation to it, is taken 
to confirm the existence of a behavior hierarchy. While the sequence 
of acquisition is not observed directly, it is inferred from the fact 
that individuals who can perform higher level behaviors show evidence 
of having also learned, or otherwise acquired, all lower level 
behaviors. The lower level behaviors, in other words, appear to be 
prerequisites for the higher level ones. 

* One example of a longitudinal study of intellectual development is Piaget's 
study of his own three children reported in "The Origins of Intelligence in 
Children." 

† The term "test" is used here and throughout this paper to denote a 
collection of individual items which are presumed to measure the same behavior 
and for which a single "pass" or "fail" score can be assigned. Thus, "tests" are 
treated in this research the way "items" were treated in Guttman's original work. 
An "objective," as used here, is a description of the behavior sampled in a test. 
It represents an intended outcome of instruction. 

Educational test designers have become interested in scalogram 
analysis primarily as a means of constructing test batteries for 
diagnostic or "placement" purposes. In such testing, the aim is to 
determine in which specific parts of a curriculum an individual needs 
instruction rather than to assess a general "level" of performance or 
to compare individuals or groups. For this purpose, it is often 
necessary to test large numbers of specific behavioral objectives. This 
can be an exceedingly complex and time-consuming procedure. 

The existence of empirically validated hierarchies can permit 
substantial economy in placement testing, since subjects who pass a 
test at the top of a hierarchy can be assumed to be capable of passing 
all lower level tests. Thus, by testing the top objectives in a number 
of hierarchies, a student's general "entering level" can be quickly 
assessed. Subjects who fail the top-level tests in a given hierarchy 
can then be tested for the lower level objectives to determine specific 
instruction needs. 
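The placement logic just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hierarchy is taken to be linear and already validated, the function name `place_student` and the `passes` callback (which stands for administering one test) are hypothetical, and the rule of stopping at the first lower-level test that is passed is our reading of the economy argument rather than a procedure given in the paper.

```python
def place_student(hierarchy, passes):
    """Return (objectives needing instruction, number of tests given).

    hierarchy: objectives ordered from lowest to highest level
               (assumed linear and empirically validated).
    passes:    hypothetical callback; passes(objective) -> bool stands for
               administering the test for that objective.
    """
    top = hierarchy[-1]
    tests_given = 1
    if passes(top):
        # Passing the top test is taken to imply passing all lower tests.
        return [], tests_given

    needs = [top]
    # Work downward until a test is passed; everything below it is assumed.
    for objective in reversed(hierarchy[:-1]):
        tests_given += 1
        if passes(objective):
            break
        needs.append(objective)
    return list(reversed(needs)), tests_given


# Illustrative use with made-up objectives and a made-up student.
mastered = {"A", "B"}
needs, n_tests = place_student(["A", "B", "C", "D"], lambda o: o in mastered)
print(needs, n_tests)
```

A student who passes the top test is placed after a single test; a student who fails it is tested downward only as far as the first objective already mastered.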

To learning psychologists and curriculum designers, hierarchies 
represent a means of sequencing learning tasks in such a way as to 
maximize transfer from one task to another in order to facilitate the 
learning of successively more complex behaviors. This means that the 
requirement of predicting passage of tests lower in the hierarchy is 
subordinated to the requirement of generating hierarchies in which 
training on one task has a predictable effect on learning tasks higher 
in the hierarchy. These two requirements, prediction downward and 
learning facilitation upward, are closely related. However, they are 
not necessarily completely correlated. It is theoretically possible for 
objectives to scale perfectly and yet for instruction in a task lower 
on the scale not to produce significant amounts of transfer to higher 
level objectives. On the other hand, it may be possible 
to construct highly efficient instructional sequences which introduce 
objectives without having first established all prerequisite behaviors 
specified in a scale. Researchers interested in the use of hierarchies 
as a means of sequencing instructional objectives, therefore, are 
necessarily concerned that hierarchy validation studies seek to 
establish independently the scaling properties of hierarchies and their 
learning transfer properties. The extent to which transfer and scaling 
relationships coincide can then become a matter for empirical 
investigation. 

Gagné was the first to formally propose the use of learning 
hierarchies in designing educational programs, although various methods 
of "task analysis," leading to hierarchy-like structures, had been used 
in developing industrial and military training programs for some 
time. Gagné has outlined a procedure by which behaviors can be 
analyzed by asking the single question, "What kind of capability 
would an individual have to possess to be able to perform this task 
successfully, were we to give him only instructions?" One or more 
subordinate tasks are specified in response to this question. The 
question is then applied to the subordinate tasks themselves, and so on 
successively down the hierarchy until tasks that can be reasonably 
assumed in the student population are reached. In our own work we 
have been developing rather more formal methods of generating 
hierarchies. Our method is based on an analysis of skilled performance 
that has certain features in common with the technique of "protocol 
analysis" developed by Newell in connection with information 
processing and computer-simulation studies. We also insist on a 
rigorous specification of stimulus and response in our task definitions, 
which has the effect of keeping each of our tasks more "unitary" than 
most of Gagné's. Operationally, this means that fewer test items 
would be needed to sample each task in our hierarchies than in 
Gagné's. 

Figure 1 is an example of one of our hypothesized learning 
hierarchies. Each box defines a task. The entry above the line defines 
the stimulus situation; the entry below the line, the response. The 
simpler behaviors, according to our analysis, appear at the bottom of 
the chart; the more complex behaviors toward the top. Note that this 
hierarchy, like most of Gagné's, is non-linear. For example, behavior 
E is considered prerequisite both to G and F, and H is shown as 
having two prerequisites, C and F. For instructional purposes, 
sequences ABC and DEF could be taught simultaneously, or either 
one might come first; but both would have to be learned before H 
could be acquired. This branching characteristic permits us to 
recognize within a hierarchical framework much of the variety and 
complexity that characterizes learning patterns. For this reason, we 
believe that hierarchies of this kind more accurately reflect 
psychological reality than do the linear hierarchies mainly used by 
developmental psychologists and by testers. However, a branching 
hierarchy poses certain knotty problems in validation methodology. 

FIGURE 1 
These are the problems to which much of our current work in 
hierarchies is addressed, and to a discussion of which we now turn. 

Our first validation studies were concerned with the "scaling" 
properties of a set of hierarchies in the area of early quantification 
skills. Figure 1 represents one of the hierarchies studied. A battery of 
criterion-referenced tests* was developed, one for each of the 
objectives included in the hierarchies. The battery of tests was 
administered to a random sample of kindergarten children in September, 
1968, before any formal instruction in the curriculum was given. 
The results of these tests were then analyzed for scaling properties. 

Our first analyses represented an attempt to adapt existing linear 
scaling procedures to the validation of branching hierarchies. For 
this purpose we used the Multiple Scalogram Analysis (MSA), a procedure 
developed by Lingoes. This procedure was selected for several 
reasons. First, it can not only validate or refute a hypothesized 
sequence but can also suggest a more nearly optimal sequence or set of 
sequences. It also provides multi-dimensional information about the 
tests in a given scale. When the data demand it, it can yield multiple 
scales rather than rejecting the scale hypothesis for the set treated 
as a whole. With respect to statistical reliability, MSA contains a 
measure to control for spuriously high estimates of "reproducibility," 
Guttman's classical measure of scalability. This is an important 
feature of the program, since the possibility of inflated 
reproducibility indices, due to extreme pass or fail rates on certain 
tests in the battery, has been one of the major criticisms of 
Guttman's method in the past.23-^»^«»35,3,4,22 Finally, 
a computer program has been developed for MSA, the Format Free 
MultiScaling Program (SCALE); therefore, MSA is an economical 
and convenient procedure to use, especially when dealing with large 
sets of data. 

* "Criterion-referenced test" is an achievement test developed to assess the 
presence or absence of a specific criterion behavior described in an instructional 
objective. Such a test provides information about the competence of a student 
that is independent of the performance of other students. For further discussion 
of criterion-referenced tests see Glaser. 

Although the MSA program is capable of picking out multiple 
scales, these scales are independent of one another, having no 
objectives in common. Once an objective is selected for inclusion in a 
scale, it is no longer considered for membership in other scales. For 
example, with respect to Figure 1, if objective H were to scale with 
C, B, and A it could not appear in a scale with F, E, and D in the 
same analysis. Therefore, in order to apply the program to validate 
a branching hierarchy, it was necessary to test separately each of 
the linear pathways implied by the hierarchy. For the hierarchy 
shown in Figure 1 we ran five separate analyses: ABCH1H2IK; 
ABCH1H2J1J2; DEFH1H2IK; DEFH1H2J1J2; and DEG. 

The input data for the analyses consisted of a pass or fail score for 
each subject on each test. The index of the degree to which the ob- 
jectives are sequenced is operationally defined as the reproducibility 
criterion for Guttman scales: 

$$\text{Rep.} = 1 - \frac{\text{Sum of errors}}{\text{Total responses}}$$

Error is defined as a case where a subject passes a higher level ob- 
jective and fails a lower objective. In this study, the criterion of 
reproducibility was set at .85. This meant that only those tests that 
could enter a scale with a reproducibility equal to or greater than 
.85 were included in the scale. 
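As a rough illustration of the reproducibility index, the following Python sketch computes Rep. for a small pass/fail matrix. The paper does not state exactly how errors were tallied, so the counting rule used here is an assumption: each subject's responses are compared with the ideal pattern for a subject with the same number of passes, and every mismatching response counts as one error. The function name and the data are hypothetical.

```python
def reproducibility(responses):
    """Rep. = 1 - (sum of errors / total responses).

    responses: one row per subject, columns ordered from the lowest-level
    (easiest) test to the highest; 1 = pass, 0 = fail.
    Assumed error rule: mismatches against the ideal pattern that has the
    same number of passes, all on the lowest tests.
    """
    errors = 0
    total = 0
    for row in responses:
        score = sum(row)
        ideal = [1] * score + [0] * (len(row) - score)
        errors += sum(1 for got, want in zip(row, ideal) if got != want)
        total += len(row)
    return 1 - errors / total


# Made-up data: the third subject passes test 2 while failing test 1.
data = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
]
rep = reproducibility(data)
print(round(rep, 3), rep >= 0.85)
```

Under this rule the single out-of-order subject contributes two errors in twelve responses, so the set would fall just short of the .85 criterion used in the study.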

The results of these analyses are shown in Table 3. For each 
analysis the first column shows the hypothesized scale and the second 
column shows the empirical scale generated by MSA. Analysis 1 
shows that K and I (counting ordered and unordered arrays of 
objects) had been placed too high in the hypothesized sequence. These 
counting tasks, according to the data, should come before tasks 
involving numerals (B, C, H1, H2). The basic sequence with respect to 
learning numerals (A, then B, then C, then H), however, was 
confirmed. Matching numerals (A) appeared as prerequisite to counting, 
but this may have been an artifact of the very high rate of passing 
test A. Where nearly all subjects in a sample can perform a behavior, 
scaling may show it as prerequisite even to unrelated behaviors. 
Analysis 3 tests the sequence of all counting objectives (D, E, F, I and 
K) and suggests that counting fixed arrays (K and I) comes before 
counting out a subset from a larger set (F). Even counting out a set 
(F), however, should come before using numerals (H1 and H2), 
according to this analysis. In combination, Analyses 1 and 3 suggest 
that our initial hierarchy introduced numerals too early in the 
counting sequence. The implication, not directly tested in these 
analyses, is that counting of various kinds must be established before 
numeral recognition can be learned. Analyses 2 and 4 support this 
interpretation, and also suggest a reordering of the sub-objectives in 
H and J. 

On the basis of these analyses, it was possible to construct a new 
learning hierarchy, rearranging the original objectives. This 
hierarchy is shown in Figure 2. The five objectives involving counting 
of objects (D, E, K, I, F) are now in a linear order, with numeral 
identification (B) appearing as an upward branch from I. Visual 
matching of numerals (A) is shown as prerequisite only to numeral 
identification and reading (B and C) because, despite its apparent 
relationship to K and I in the empirical scales, it did not seem 
reasonable to expect that learning visual matching of numerals would 
help in learning to count. H and J sub-objectives appear in the new 
order suggested by the analyses. This order seems quite reasonable 
since both H1 and J1 involve counting a set (of objects or events) in 
response to a symbolic presentation, and both H2 and J2 involve 
selecting symbols to match sets. Counting claps (G) is retained as a 
separate branch. As with all post-hoc interpretations, of course, it 
will be necessary to test this reordered hierarchy using new samples of 
subjects before accepting its validity. 

FIGURE 2 
Reordered Hierarchy for Quantification 

TABLE 4 

Comparison of Hypothesized and Empirical Scale 
for Counting Objects and Comparison of Sets 
(N=37) 

                                                      Empirical Scale 
Hypothesized Scale                              Scale 1   Scale 2   Scale 3 

Q I    D  (Rote count 0-5)                      I D       VII B     VII D 
       E  (Count moveable objects 0-5)          II D      II E 
       F  (Count out a set 0-5)                 I E       VII F 
Q II   D  (Rote count 6-10)                     I F       VII C 
       E  (Count moveable objects 6-10)         VII E     II F 
       F  (Count out a set 6-10) 
Q VII  B  (Pair sets: equal, unequal) 
       C  (Pair sets: more, less) 
       D  (Pair sets: most, least) 
       E  (Count sets: equal, unequal) 
       F  (Count sets: more, less) 
      *G  (Count sets: most, least) 

Reproducibility                                 .950      .886      1.000 

* Eliminated from consideration because all S's failed. 

In this first application, Multiple Scalogram Analysis proved 
usable, although awkward in requiring so many separate analyses. Our 
next attempt to apply MSA, however, was to reveal more serious 
complications. Table 4 shows the results of an attempt to test the 
hierarchical relations between counting skills (Q I and Q II) and 
two methods of comparing set size, (a) by one-to-one correspondence 
(Q VII B, C, D) and (b) by counting each set (Q VII E, F, G). Our 
hypothesis in this case was a linear one. We predicted that children 
would first learn to count five objects (I D, E, F), then ten objects 
(II D, E, F); and that they would then learn to compare sets, first by 
one-to-one correspondence (VII B, C, D) and then by counting (VII 
E, F, G). The empirical analysis yielded three independent linear 
scales. Scale 1 includes all of the objectives for counting to five, in 
the predicted order, but also suggests that children learn rote 
counting to ten (II D) before they learn to count five objects. One 
objective for comparing by counting (VII E) falls into this scale. 
However, the objectives for counting objects to ten (II E and F) do 
not. Instead they appear in Scale 2 along with comparing by one-to-one 
correspondence (VII B and C) and the other comparing by counting 
objective (VII F). One objective (VII D) did not fall into either scale 
and appears by itself as Scale 3. 

There are several difficulties in interpreting these results. Some 
difficulties derive from MSA's restriction to independent linear scales. 
For example, it is unlikely that counting objects to ten (II E and F) 
is truly independent of counting to five (I E and F). In MSA, 
however, the tests could not enter Scale 1 unless they also scaled with 
objective VII E. A possible hierarchy for these objectives is an 
upward branch in which counting to five leads both to counting to ten 
and to comparing sets. 
However, using MSA, this hypothesis could have been tested only 
by running two separate analyses: I D, II D, I E, I F, VII E, VII G; 
and I D, II D, I E, I F, II E, II F. Similarly, comparing via one-to-one 
correspondence may be prerequisite to comparing via counting, 
although not to simple counting. Here a downward branch can be 
proposed in which both one-to-one correspondence and counting 
are prerequisite to comparison by counting. 

Again, however, this hierarchy is not directly testable under the 
assumptions of MSA. 

Another source of difficulty in interpretation derives from the use 
of so many separate tests for closely related objectives. Possibly, by 
combining related behaviors we might produce more stable measures 
of the key classes of behavior and thus generate more easily 
interpretable scales. To explore this possibility, we next combined all 
tests of counting to five and gave a single pass or fail score for the 
set of tests. The same was done for the tests of counting to ten. 
Similarly, we computed a single score per subject for all tests covering 
the use of numerals to five and another for the numerals to ten. 
Finally, tests for comparing sets were combined to yield one score 
for the counting method and one score for the one-to-one 
correspondence method. These six summary scores were then analyzed 
using Multiple Scalogram Analysis. The results appear in Table 5. 

In this analysis, all of the objectives involving counting fall into a 
single, quite easily interpreted scale. According to this scale, skill in 
counting objects is acquired before the numerals are learned (I 
before II, and III before IV), but both counting and numerals to five 
are learned before the child learns to count to ten. Comparison of 
sets by counting is acquired only after basic counting and numeration 
are established. Comparison of sets by one-to-one correspondence 
(V) appears in this analysis as an independent class of behaviors, 
neither dependent upon nor prerequisite to counting and 
numeration skills. This finding seems reasonable with respect to 
simple counting and numeration skills (Objectives I-IV). However, 
it seems unlikely that the two comparison skills (Objectives V and 
VI) are completely unrelated to each other. In the MSA program, 
once Objective VI was shown to scale with Objectives I through IV 
it could not be considered for membership in a scale with Objective 
V. Although a separate program run for Objectives V and VI alone 
would have been technically possible, the assumptions of the Guttman 
scaling procedure make the testing of two-item scales a questionable 
procedure. Thus, there was no acceptable means, within 
the "scalogram" framework, of testing the hypothesis of a conjunctive 
branch in which counting and numeration to 10 (Objectives III and 
IV) and comparison of sets by one-to-one correspondence (Objective 
V) are prerequisite to comparison by counting (Objective VI). 

TABLE 5 

Comparison of the Hypothesized and Empirical Scales 
Basic Number Concept Units 
(N=37) 

                                                          Empirical Scale 
Hypothesized Scale                                        Scale I   Scale II 

Objective I   (Counting objects 0-5)                      I         V 
Objective II  (Using numeral representation 0-5)          II 
Objective III (Counting objects 6-10)                     III 
Objective IV  (Using numeral representation 6-10)         IV 
Objective V   (Comparison of set size by                  VI 
               one-to-one correspondence) 
Objective VI  (Comparison of set size by counting) 

Reproducibility                                           .957      1.000 

The repeated awkwardness of Guttman's scaling procedures in 
dealing with branching hierarchies led us to search for an 
alternative validation method whose assumptions would more closely 
match those of our hierarchical theory. Our requirements were the 
following: 

1. Our hierarchies are generated one level at a time, by first 
identifying components of the terminal behavior, next identifying 
prerequisites of these components, then prerequisites of the 
prerequisites, and so on in a succession of individual "analyses." 
This means that the critical relationships in a hierarchy are those 
between vertically adjacent items (e.g., in Figure 1, between F and 
H, E and F, C and H, E and G, etc.) rather than across an entire 
scale. Thus, it was appropriate to seek a method of validation 
that tested these adjacent relationships directly and did not 
immediately seek to construct multi-test scales or summary 
statistics covering an entire hierarchy. 

2. The validation method should provide a means of testing 
several kinds of branches. These include (a) upward branches, in 
which a single objective is prerequisite to two or more higher 
level objectives (e.g., in Figure 1, E is prerequisite to both F 
and G); (b) downward conjunctive branches, in which several 
objectives are jointly prerequisite to a single higher level one 
(e.g., in Figure 1, F and C must both be learned before H can 
be learned); (c) downward disjunctive branches, in which any one 
of several objectives is a prerequisite to a higher level one. 
Figure 3 shows a downward disjunctive branch. The hierarchy 
hypothesizes that in order to compare the number of objects in 
two rows (C) the child can either count the sets (A) or use a 
method of one-to-one correspondence (B). He need not, 
however, be able to perform both A and B. 



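FIGURE 3 
A Disjunctive Branch 

[Figure: the higher level task C (stimulus: 2 rows of objects, not 
paired; response: state which row has more (less) regardless of 
length) has two alternative prerequisites joined by OR: task A 
(stimulus: 2 sets of objects; response: count sets and state which 
has more (less)) and task B (stimulus: 2 sets of objects; response: 
pair objects and state which set has more (less)).] 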



3. The method selected should, ideally, permit a process of "search" 
among objectives for hierarchical relationships not previously 
hypothesized. These would in effect provide hypotheses for 
subsequent studies. While this is not a theoretical requirement, the 
possibility of such searches would be a valuable tool during the 
early stages of research in a new area. This capability will of 
course require a computerized analysis capable of handling 
large quantities of data and of considering many alternative 
relationships. 

Other investigators have used procedures that met the first of 
these requirements. Gagné's various hierarchy studies used pass 
and fail contingencies for adjacent objectives in a hierarchy to 
compute a "proportion of positive transfer" statistic, essentially the 
inverse of the percentage of cases in which an individual passes a 
higher level test while failing the lower level "prerequisite." 
Walbesser's proposed method for validating the AAAS science curriculum 
also uses pass-fail contingencies to test the "dependency" of each 
individual objective on its immediate prerequisite. Both Gagné and 
Walbesser directly test downward conjunctive hypotheses by combining 
data for two or more prerequisite tests and assigning a "pass" 
score only if all tests are passed. Upward branches are not tested 
directly, but are in effect implied when each of two higher-level 
objectives is shown to have the same lower-level objective as its 
prerequisite. However, neither Gagné nor Walbesser has discussed 
methods of testing downward disjunctive branches. Finally, neither of 
these methods is appropriate for empirical construction of hierarchies 
from test data, as opposed to validation of deductively analyzed 
hierarchies. 
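The pass-fail contingency idea, and the conjunctive combination of prerequisites, can be sketched as follows. This is not Gagné's or Walbesser's actual statistic, whose exact formulas are not given here; it only tabulates the four pass/fail combinations for a prerequisite and a higher level test and picks out the cell that contradicts the hierarchy. All names and data are hypothetical.

```python
def conjunctive(*tests):
    """Combine prerequisite tests: pass (1) only if every test was passed."""
    return [int(all(scores)) for scores in zip(*tests)]


def contingency(lower, higher):
    """Count subjects in each (lower result, higher result) cell."""
    table = {(l, h): 0 for l in (0, 1) for h in (0, 1)}
    for l, h in zip(lower, higher):
        table[(l, h)] += 1
    return table


# Made-up scores for five subjects on tests F, C and H (1 = pass).
f = [1, 1, 1, 0, 0]
c = [1, 1, 0, 1, 0]
h = [1, 0, 0, 0, 0]

prerequisite = conjunctive(f, c)      # hypothesis: F and C jointly precede H
table = contingency(prerequisite, h)

# Cell (0, 1): failed the combined prerequisite yet passed H.
# These are the cases that count against the hypothesized dependency.
print(table[(0, 1)], "contradicting case(s) out of", len(h))
```

A dependency is supported to the extent that the contradicting cell stays empty or nearly so; how small it must be is a judgment the paper leaves to each method.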

Dr. John Carroll, of ETS in Princeton, has developed a hierarchy 
validation procedure that meets the requirements outlined in 
paragraphs 1 and 2, and which will also be, once a computer program is 
completed, quite economical to apply to large quantities of data, 
thus permitting empirical search for hierarchical relationships. 
Carroll's method, like those of Gagné and Walbesser, begins with the 
construction of pass-fail contingency tables for all possible pairs of 
items in the hierarchy. Phi/Phimax* coefficients are then computed 
for each table. When the coefficient reaches an acceptable level, a 
hierarchical relationship between the two items is inferred, with the 
test showing the higher pass rate considered prerequisite to the one 
with the lower pass rate. On the basis of these simple prerequisite 
relationships, it is possible to construct a hierarchy which can have 
both linear and branching sections. 

* "Phi" is essentially an estimate of the correlation between two tests, each 
scored dichotomously. Phimax is an estimate of the highest possible phi 
coefficient given the marginals of the contingency table. Since phimax would 
become larger as the pass or fail rate of either test became more extreme, the use 
of phimax in the denominator essentially controls against artificial inflation of 
the association due to extreme pass or fail rates. 

FIGURE 4 

Hierarchy of Counting and Set Comparison Skills 
According to Carroll's Analysis 

Figure 4 shows a hierarchy derived from applying Carroll's program
to the data analyzed in Table 4. The hierarchy contains both
upward branches and downward conjunctive branches. Each of these
types of branches can be logically derived from the simple
prerequisite relationships.* Downward disjunctive branches, however, must
be tested directly. The Carroll program will do this by combining
two tests and giving them a pass score if either of the two tests was
passed. Phi/Phimax coefficients will then be computed for these new
scores. Since the computer program for disjunctive contingencies has
not yet been completed, and hand calculation is extremely tedious,
we have not yet applied this analysis to our data. However, we
believe that the study of alternate routes to learning objectives (the
essence of the disjunctive hypothesis) may be one important means
of accounting for individual differences within a hierarchical
framework.
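
The disjunctive check described above can be sketched in Python as
follows. This is only an illustration under stated assumptions: scores
are lists of 0/1 values, one per subject in the same order, and the
function names are hypothetical rather than taken from the Carroll
program, which the paper reports as unfinished.

```python
from math import sqrt


def combine_disjunctive(test_a, test_b):
    """Score a subject as passing (1) if either of the two tests was passed."""
    return [1 if (a or b) else 0 for a, b in zip(test_a, test_b)]


def phi_over_phimax(x, y):
    """Return phi / phimax for two lists of 0/1 scores of equal length."""
    n = len(x)
    px = sum(x) / n
    py = sum(y) / n
    pxy = sum(1 for a, b in zip(x, y) if a and b) / n
    spread = px * (1 - px) * py * (1 - py)
    if spread == 0:
        raise ValueError("phi is undefined when a test is passed by all or by none")
    phi = (pxy - px * py) / sqrt(spread)
    hi, lo = max(px, py), min(px, py)
    phimax = sqrt(lo * (1 - hi) / (hi * (1 - lo)))
    return phi / phimax


# Made-up scores for six subjects: the higher-level test is passed only
# by subjects who passed at least one of the two lower-level tests.
lower_a = [1, 0, 0, 0, 1, 0]
lower_b = [0, 1, 0, 0, 0, 0]
higher = [1, 1, 0, 0, 0, 0]

combined = combine_disjunctive(lower_a, lower_b)
ratio_combined = phi_over_phimax(combined, higher)
ratio_a_alone = phi_over_phimax(lower_a, higher)
```

In this made-up example the combined score is more strongly related to
the higher-level test than lower_a is on its own, which is the pattern
a downward disjunctive branch would be expected to produce.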

The hierarchy in Figure 4 shows many branches, with very short
linear paths. It is in some respects easier to interpret than the scales
shown in Table 4. Essentially, the hierarchy breaks up Scale 1 of
Table 4, showing rote counting to ten (II D) as not prerequisite to
counting objects to five (I E and F), but as dependent upon rote
counting to five (I D). This is precisely what would be expected from
a behavioral and logical analysis of counting skills. On the other
hand, the hierarchy also shows the five tests of Scale 2 as being
unrelated to one another. This result is not so easy to interpret;
behavioral analyses would have predicted that VII C would remain
dependent upon VII B, and II F on II E. Further testing using new
subject samples and, where necessary, revised tests, will be needed
both to clarify the substantive issues raised here and to further
explore the characteristics of Carroll's validation method.

* Direct testing of downward conjunctive branches is not logically necessary.
If a test is independently dependent on each of two other tests, then it cannot
logically be passed unless each of its prerequisites is passed. Nevertheless,
Carroll is planning to include an empirical check on this deduction by combining
two or more tests to yield a single pass or fail score and then computing
phi/phimax coefficients for the combined scores.

In the research discussed up to this point, attention has focused
exclusively on the possibility of predicting lower level behaviors
from performance on higher level ones; no attempt has been made
in these studies to directly study the effects of learning lower level,
presumably prerequisite, skills on the learning of higher level
behaviors. To study these transfer effects, experiments involving
instruction in the elements of the hierarchy are required. Such
experiments, by directly inducing acquisition of certain behaviors, permit
more direct tests of transfer hypotheses.

Gagné^^ reported an exploratory study in which ability to perform
a terminal task, given verbal directions only and no "practice,"
was measured before and after completion of a hierarchically
arranged teaching program which stopped short of the terminal
objective. This study in effect measured transfer to the terminal task
from all of the subordinate learning sets combined. Other studies by
Gagné,^^'*^ as well as a more recent study by Ford and Meyer,⁹ use
a combination of instruction and scale analysis to test transfer among
the subordinate sets themselves.

In each of these studies subjects worked through a teaching
program designed to teach each of the behaviors in the hierarchy.
Although the programs were designed to teach with a minimum of
errors, demonstrated mastery of one unit was not required in order
to move to the next unit. Thus it was possible to "complete" the
program without mastering all of the behaviors taught. Upon
completion of the program subjects were tested on mastery of each separate
behavior in the hierarchy. The data were examined to determine the
percentage of subjects able to perform each behavior who were not
also able to perform the predicted prerequisites for that behavior,
in effect for scaling "errors." A low rate of such errors indicated that
mastery of the "prerequisite" was needed in order to profit from
direct instruction in the higher-level objective and thus confirmed the
hierarchical hypotheses.
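
The scaling "error" rate described above can be illustrated with a
short Python sketch. The function name and the 0/1 score lists are
assumptions for illustration only. The sketch covers a single
behavior and a single predicted prerequisite, and the studies cited
may have tabulated their data differently.

```python
def scaling_error_rate(higher, prerequisite):
    """Proportion of subjects who passed the higher behavior but failed
    its predicted prerequisite, among all who passed the higher behavior.

    Both arguments are lists of 0/1 scores, one entry per subject, in
    the same subject order. Returns None if nobody passed the higher
    behavior, since the rate is then undefined.
    """
    prereq_scores = [p for h, p in zip(higher, prerequisite) if h]
    if not prereq_scores:
        return None
    errors = sum(1 for p in prereq_scores if not p)
    return errors / len(prereq_scores)


# Made-up post-test scores for five subjects.
higher_scores = [1, 1, 0, 1, 0]
prereq_scores = [1, 1, 1, 0, 0]
rate = scaling_error_rate(higher_scores, prereq_scores)
```

Here three subjects passed the higher behavior and one of them failed
the prerequisite, so the rate is one in three. A rate near zero is the
outcome that would support the hierarchical hypothesis.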

A study by Merrill²⁴ introduced a mastery criterion into the
teaching program itself as a means of testing the transfer characteristics
of a hierarchy. Some subjects were given correction and review on
successive tasks within a program until they reached a criterion of
mastery; other subjects continued through the program regardless of
mastery of the successive tasks. Merrill assumed, in accord with
hierarchical theory, that mastery of lower level tasks would produce
faster, more accurate learning and better retention of higher level
tasks. He thus predicted that the correction and review group would
go through the program more quickly and would perform better on
immediate and delayed post-tests than the other group. These
predictions were not borne out, and Merrill concluded that mastery of
tasks lower in a hierarchy is not essential to learning a higher level
task. It should be pointed out, however, that the hierarchy on which
Merrill's teaching program was based had not been independently
validated. Thus, Merrill's results may simply mean that the
particular hierarchy studied is invalid rather than that hierarchically
ordered sequences in general do not produce positive transfer.

All of the studies just described have attempted to study transfer
properties of an entire hierarchy, and each has used a fairly extensive
teaching program as its instructional vehicle. An alternative strategy
is to study transfer relationships between adjacent pairs of behaviors
in a hierarchy or among short sequences of behaviors. This strategy,
while requiring many more separate studies than the total hierarchy
approach, permits much tighter experimental design. In addition, as
Gagné*^ has pointed out, it puts hierarchy research in contact with a
past body of psychological research in transfer variables. A number
of experimental designs for such small-scale transfer studies are
possible.

One such design is to teach several behaviors in each of several
different orders to different groups of subjects and to take repeated
measurements of achievement of all behaviors during the course of
instruction. Uprichard³³ used this approach in studying various
sequences of instruction for the basic mathematical concepts of
"greater than" (G), "less than" (L), and "equivalent to" (E). Six groups of
nursery school children received small group instruction in these
three concepts, each group learning the concepts in a different
sequence. A test covering all three concepts was administered at the
end of each week of instruction. When three out of the four subjects
in a group reached criterion on the concept being taught, the entire
group moved on to the next concept in its sequence. The week-by-week
test scores on each concept for each of the groups provided the
basic data in this study. Only the groups who were taught E first
reached criterion on a concept in the first week of instruction. The
groups beginning with G and L reached criterion on E in the third
or fourth week of instruction without ever being taught the concept
directly. The groups beginning with L learned only E in four weeks
of instruction and had not learned L when the experiment ended.
Thus, the data make it clear that E is the easiest to learn of the
three concepts and L the hardest. The group taught in the order
E-G-L was the first to reach criterion on all three concepts (in the
fourth week), thus suggesting that this is the optimal order for
teaching the three concepts. However, the data are not absolutely clear in
this respect, since the G-E-L group reached criterion on both G and
E in the third week at a time when the E-G-L group had still
acquired only E.

A more sensitive measure of learning is available when subjects
are run individually; trials to criterion or error rates on each task in
the learning situation itself can then be used as the dependent
variable. Assume that two behaviors are taught in two orders, A-B
and B-A, to two groups of subjects. According to hierarchical theory,
if B is dependent on A then trials to criterion for task B in the order
A-B should be significantly lower than for the same task in order
B-A. An additional implication is that in order B-A, A should be
learned virtually without error in the formal presentation, since the
subject must somehow have learned A on his own in order to have
acquired B. Finally, the total number of trials for tasks A and B
combined should be lower in A-B than in B-A order, since the former
would be a more efficient order in which to teach the set of tasks.
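
The three predictions above can be laid out in a small Python sketch.
It is descriptive only: it compares group means and does not perform
the significance test that the argument calls for. The function name,
the data layout, and the numbers are assumptions for illustration.

```python
from statistics import mean


def compare_orders(ab_group, ba_group):
    """Summarize trials to criterion for the A-B and B-A teaching orders.

    Each group is a list of (trials_on_A, trials_on_B) pairs, one pair
    per subject.
    """
    b_after_a = mean(b for _, b in ab_group)
    b_first = mean(b for _, b in ba_group)
    a_second = mean(a for a, _ in ba_group)
    total_ab = mean(a + b for a, b in ab_group)
    total_ba = mean(a + b for a, b in ba_group)
    return {
        # Prediction 1: B takes fewer trials when A has been taught first.
        "b_needs_fewer_trials_after_a": b_after_a < b_first,
        # Prediction 2: A is learned almost at once when it follows B.
        "mean_trials_on_a_when_second": a_second,
        # Prediction 3: the A-B order takes fewer trials overall.
        "ab_total_is_lower": total_ab < total_ba,
    }


# Made-up trials to criterion for two subjects per order.
summary = compare_orders(ab_group=[(10, 8), (12, 6)], ba_group=[(1, 20), (2, 18)])
```

A real analysis would replace the mean comparisons with an appropriate
statistical test on the trials-to-criterion scores.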

A recently completed experiment by Resnick, Siegel and Kresh³¹
used this design in a study of double-classification skills in young
children. Two tasks were used. Both required the child to correctly
place objects in the cells of a matrix. In task A the defining attribute
for each row and column was "given" to the child in the form of a filled
"attribute" or "edge" cell. In task B, there were no attribute cells and
the subject had to infer the defining attribute from filled interior
cells in the matrix. A typical matrix for each task appears in Figure
5. We hypothesized that task B was dependent upon task A. In
accord with the predictions just outlined, our results showed
significantly more trials to criterion for task B when it came first than when
it was preceded by task A. In addition, the predicted "immediate"
learning of task A in second place did occur for subjects who had
succeeded in learning B. However, the number of trials to criterion
for the two tasks combined was not significantly different for the two
orders.

FIGURE 5
Sample Matrix Tasks

Members of our staff are now designing several other transfer
experiments which will be run over the next several months. We view
such studies as a means not only of ordering specific behaviors, but
also of exploring the relations between hierarchical sequences and
actual teaching procedure. For example, we intend to explore the
conditions under which practice on a terminal behavior may be
more efficient than learning a hierarchical set of subordinate
behaviors. We will also want to ask, as we have begun in the study just
reported, what effect practice on the terminal behavior has on
learning subordinate behaviors. Eventually, as the parameters of transfer
in learning hierarchies become clearer, we hope it will be possible to
define individual differences in learning as a function of the ways in
which hierarchical structures are acquired. Some individuals, for
example, may be able to skip over certain behaviors in a hierarchy
while others may need explicit instruction at every step. Similarly,
some may need extensive practice, to the point of "overlearning,"
before a newly learned behavior facilitates learning of a higher level
objective, while others may show transfer effects from brief exposure.

With respect to applied work in curriculum design and evaluation,
our work will continue to be concerned with defining and
sharpening the role of hierarchical analysis, and in particular with
determining the extent to which scalability of tests accurately predicts
transfer relations among the behaviors. To explore this question, it
will be necessary to conduct both psychometric studies, in which
batteries of tests are administered and examined for hierarchical
relationships, and experimental training studies, in which the behaviors
in question are taught and transfer effects evaluated. By conducting
both types of studies on each major hierarchy investigated, we
expect to be able to examine empirically the extent to which scaling
properties of hierarchies have direct implications for teaching
sequences. We will also be able to explore the extent to which varying
teaching sequences can produce differing scale structures. As these
relationships become clearer, behavior analysis and learning
hierarchies can be expected to become increasingly valuable tools
in educational research and development.

REFERENCES 

1. Carroll, J. B. Personal communication, March, 1969.

2. Cox, R. C. and Graham, G. T. The development of a sequentially scaled
achievement test. Journal of Educational Measurement, 1966, 3, 147-150.

3. Edwards, A. L. On Guttman's scale analysis. Educational and Psychological
Measurement Journal, VIII, 1948, 313-318.

4. Edwards, A. L. Techniques of attitude scale construction. New York:
Appleton-Century-Crofts, 1957.

5. Elkind, D. Children's discovery of the conservation of mass, weight, and
volume: Piaget replication study II. Journal of Genetic Psychology, 1961,
98, 219-227.

6. Ferguson, R. L. The development, implementation, and evaluation of a
computer-assisted branched test for a program of individually prescribed
instruction. Unpublished doctoral dissertation, University of Pittsburgh, 1969.






7. Festinger, L. The treatment of quantitative data by "scale analysis."
Psychological Bulletin, 1949, 44, 146-161.

8. Flavell, J. The developmental psychology of Jean Piaget. New York: Van
Nostrand, 1963.

9. Ford, T. D., Jr. and Meyer, J. K. A test of Gagné's hypothesis of hierarchical
subtasks for developing learning program and an alternate proposal. Paper
presented at Western Psychological meeting, April 28, 1966, at Long Beach,
California.

10. Gagné, R. M. The acquisition of knowledge. Psychological Review, 1962,
69, 355-365.

11. Gagné, R. M. Curriculum research in the promotion of learning. Perspectives
of Curriculum Evaluation, 1967.

12. Gagné, R. M. Learning hierarchies. Educational Psychologist, 1968, 6, (1).

13. Gagné, R. M. and Paradise, N. E. Abilities and learning sets in knowledge
acquisition. Psychological Monographs, 1961, 75 (Whole No. 518).

14. Gagné, R. M., Mayor, J. R., Garstens, H. L. and Paradise, N. E. Factors in
acquiring knowledge of mathematical task. Psychological Monographs, 1962,
76, (Whole No. 518).

15. Glaser, R. Instructional technology and the measurement of learning
outcomes. American Psychologist, 1963, 18, 519-521.

16. Green, B. F. A method of scalogram analysis using summary statistics.
Psychometrika, 1956, 21, 79-88.

17. Guttman, L. A basis for scaling quantitative data. American Sociological
Review, IX, 1944, 139-150.

18. Holland, J. G. Research on programming variables. Teaching machines and
programmed learning, II data and direction. R. Glaser, Department of Audio
Visual Instruction, National Education Association of the United States, 1965.

19. Kofsky, Ellen. A scalogram study of classificatory development. Child
Development, 1963, 191-204.

20. Kohlberg, L. Early education: A cognitive developmental view. Child
Development, 1968, 39, 1013-1062.

21. Kropp, R. P., Stoker, H. S. and Bashaw, W. L. The construction and
validation of tests of the cognitive processes as described in the Taxonomy of
Educational Objectives, U.S. Office of Education, Cooperative Research Project
2117, U.S. Department of Health, Education and Welfare, Institute of
Human Learning, Department of Educational Research and Testing, Florida
State University, Tallahassee, Florida, February, 1966.

22. Lingoes, J. C. Multiple scalogram analysis: A set theoretic model for
analyzing dichotomous items. Educational and Psychological Measurement,
XXIII, (No. 3), 1963.

23. Loevinger, Jane. A systematic approach to the construction and evaluation of
tests of ability. Psychological Monographs, 1947, 61, (Whole No. 4).

24. Merrill, D. Correction and review on successive parts in learning a
hierarchical task. Journal of Educational Psychology, 1965, 14, (5), 225-234.

25. Miller, R. B. Analysis and specification of behavior for training. Glaser, R.
(ed.) Training Research in Education. New York: John Wiley, 1965.

26. Newell, A. On the analysis of human problem solving protocols. In J. C.
Gardin and B. Jaulin (eds.) Calcul et formalisation dans les sciences de
l'homme, Presses Universitaires de France, 1968, 146-185.

27. Peel, E. A. Experimental examination of some of Piaget's schemata
concerning children's perception and thinking and discussion of their educational
significance. British Journal of Psychology, 1959, 24,

28. Piaget, J. The origins of intelligence in children. New York: International
University Press, 1952.




29. Resnick, L. B. Design of an early learning curriculum. Working Paper #16,
Learning Research and Development Center, University of Pittsburgh, 1968.

30. Resnick, L. B. Behavior analysis and the generation of learning hierarchies.

31. Resnick, L. B., Siegel, A., and Kresh, E. The sequence of acquisition of
matrix classification skills. In preparation.

32. Spiker, C. C. The concept of development: Relevant and irrelevant issues.
In H. Stevenson (ed.) Concept of development: A report of a conference
commemorating the fortieth anniversary of the institute of child development,
University of Minnesota. Monographs of the Society for Research in Child
Development, 1966, 31, (No. 5) (Serial No. 107), 40-54.

33. Uprichard, A. E. An experimental study designed to determine the most
efficient learning sequence of three set relations in the pre-school years.
Unpublished doctoral dissertation, University of Syracuse, 1969.

34. Walbesser, H. H. A hierarchically based test battery for assessing scientific
inquiry. Paper presented at the American Educational Research Association

35. White, B. W. and Saltz, E. Measurement of reproducibility. Psychological
Bulletin, 1957, 54, (2), 81-99.

36. Wohlwill, D. F. A study of the development of number by scalogram
analysis. Journal of Genetic Psychology, 1960, 97, 345-377.

37. Wang, Margaret C. (ed.) Criterion-referenced tests for the early learning
curriculum of the primary education project. University of Pittsburgh, 1968.



Some Problems with Regard to Research 
and Development in Higher Education 

LELAND L. MEDSKER



Research in education in general is big business and the segment of
higher education shares fully in the enterprise. Total appropriations
for "Research and Training" by the Congress alone increased tenfold
to a total of over a hundred million dollars in the period 1957-1969.
If grants by foundations and other donors were considered, the
amount would be considerably greater. In 1957 relatively few major
research projects pertaining to higher education were in progress.
The Inventory of Current Research on Higher Education 1968, a
project sponsored jointly by the Carnegie Commission on the Future
of Higher Education and the Center for Research and Development
in Higher Education, University of California, Berkeley, listed nearly
1,000 projects and had contacts with more than 2,000 researchers.^
Since the Inventory listed only research underway and not reported,
it does not portray the total volume of inquiry and findings that
annually bear on education beyond the secondary school.

The rapid increase in research in higher education is reflected in
many forms and types of activities. There is research within colleges
about themselves (in other words, institutional research), which has
become nearly universal in practice. Then there are the hundreds of
individual researchers engaged in inquiry on various facets of
education at the college level. Recent years have witnessed the
development of organized research units, usually but not always based in
universities, which systematically attack major problems in the field.
More will be said later about the research and development center
concept which was initiated by the United States Office of Education
in 1964 and which developed from a concern that the fragmentary
nature of most educational research did not lead to sufficiently
cumulative findings for influencing change. The alternative was to
establish a limited number of centers in which programmatic research on
crucial problem areas would be conducted. To date nine such centers
have been funded by OE, although the Berkeley unit is the only one
focused entirely on higher education. Inherent in the center idea is
the concern for translation of research findings into practice, hence
the emphasis on D or development, about which more is said later.

The prevalence of research in higher education is further reflected 
by the extent to which the Educational Research Information Center 
(ERIC) system has expanded as a means of storing and disseminating 
findings. An ERIC Center for higher education was established at 
George Washington University in 1968. It is supplemented by other 
Centers, such as the one at UCLA on junior college research data,
and one at the University of Michigan concerned with research find- 
ings on student personnel. 

Thus at the end of the decade of the 1960's there is evidence of 
widespread research and development efforts in higher education. 
The need for research in the field increases annually and the prob- 
lems connected with it become ever more complex. The remainder 
of this paper will attempt to deal with some of these problems and 
their implications. 

As background for the problems let us turn first to the changing
scene in higher education, a scene which naturally reflects a total
society that is itself far different today than it was only a few years
ago and whose future is destined to still further radical change.
Higher education is a vast enterprise. It requires the support of 2 per
cent of the gross national product and involves about 4 per cent of
the population, including more than a third of the college age group.
With the goal of near universal education through at least the junior
college years the growth spiral continues upward. And so does the
cost spiral! The financial plight of the private colleges and the
manifest resistance on the part of the public for the support of public
institutions raise serious questions as to how the enterprise is to be
sustained. But other issues loom with equal intensity, some alarmingly,
others auspiciously. They are familiar to all: student rejection of
the status quo; confrontation; violence; questions concerning the
education of various ethnic groups; cleavages between faculty,
administration, students, and governing boards; and disenchantment
on the part of the public. Efforts to meet these problems are many
and varied. They include curricular innovations, new configurations
of governance, greater emphasis on institutional and state planning,
and often sheer compromise as a means of maintaining peace on the
campus.

This is the context in which research in higher education must take
place and it makes certain problems immediately apparent. One such
problem is the increasing variability of the research variables.
Nothing is static. The nature of student input changes constantly. The
purposes and goals of higher education seem to shift from one period
to another and are perceived differently by various participants.
Changes over the period of a longitudinal study which covers the
college years may invalidate any controls of variables or may
significantly alter the nature of the study. A special problem is encountered
in those projects which attempt to assess the impact of college on a
graduate's performance in life activities by the fact that societal
situations a few years following college may be entirely different from
those for which the college experience was designed. In fact, a
question could be raised as to whether some of the earlier studies such as
those conducted at Vassar would be valid in the present climate of
change. The matters that are of primary concern about students today
are different from those which engaged us yesterday. Until recently
a study of college students might have included an assessment of
their attitudes about sex and liquor whereas today these issues may
seem pallid when compared with drugs and violence. Likewise, any
earlier study of the decision-making process in colleges and
universities that would have wrestled with the problem of authority between
faculty and administration must now be concerned with student
involvement in a variety of forms.

A second problem is what one might term the shifting relevance
and significance of issues for research. During the last few years,
indeed the last few months, developments in the nation's colleges
and universities have raised serious problems which were not
prominent theretofore. Some of them are so grave as to question the very
survival of our social institutions and most of them suggest new
priorities for investigation. Take, for example, the question of how
different ethnic groups are to be served. Just as colleges seemingly were
finally about to take steps toward integration, it now develops that
separate departments, if indeed not separate institutions, are to
emerge as the way of serving these groups. This comment implies no
value judgment as to how what is probably the most important,
difficult, and belated task in higher education is to be performed but it
does suggest that new ways and forms of serving these students
should be a research topic of high order. A similar question relates
to admission policies pertaining to students who by the usual criteria
do not meet the entrance requirements of selective institutions. What
is the impact on both the students and the institution when the usual
standards are waived? What new criteria are needed for evaluating
prospective students from culturally different backgrounds?
Consider other emerging issues such as the impact of confrontation and
compromise on the institution or the matter of student involvement
in decision-making, whether by forceful demand or peaceful
assimilation, and at once it is clear that the results need assessment by
means other than mere guesses. In the same vein one can consider
such questions as the following: the effect of federal financial aid to
students (effects on both the students and the institutions), problems
associated with what seems destined to be a greatly expanded system
of non-baccalaureate institutions (new types of vocational schools as
well as community colleges), emerging governance and planning
configurations, innovative efforts to reorganize the undergraduate
curriculum, and new concepts of graduate education. The list is
grossly incomplete, but it takes little imagination to realize that
unless such new issues become the concern of both researchers and
practitioners, Rome is in danger of burning while many people fiddle.

It is evident that an increasing amount of research must be of a
kind that is useful to decision makers, planners at local and state
levels, and legislative bodies, including Congress. The ivory tower
research concept is due to decline, if it ever existed. The day for
identifying and attacking the crucial problems and of leading toward
solutions is at hand. This, however, is the difficult way to plan
research.

Still another problem that complicates research today is the matter
of constraints which are imposed by the public in an effort to avoid
the invasion of privacy. For some time, any agency conducting
research under a federal grant has had to submit any instruments to
be used in gathering data to the Office of Education for approval.
While our own experience with OE has been positive in that its staff
has responded quickly to our many requests for reviews and in
general has been liberal in its approval of controversial items, it is
nevertheless a process which requires a great amount of time and
planning. But the same caution is now extended into other situations with
even greater constraints. For example, in California it is now illegal
to administer certain types of instruments to high school students
without prior parental consent. Such a restriction has imposed great
difficulty for the Center's SCOPE* project under the direction of
Dale Tillery in which he is to follow up some 9,800 high school
students in the state this spring. True, the requirement at the moment
applies only to students below the college level, but two matters are
of concern. In the first place, one never knows when similar legal
restrictions may be extended to the college level. And secondly, many
studies in higher education involve contacts with secondary school
students and thus the restrictive policies apply automatically.

Still another restriction arises out of those policies and laws which
prohibit the identification of students as members of ethnic groups.
While one can appreciate the rationale behind such restrictions, it
could be argued that they often tend to preclude the very research
from which the groups are most likely to profit.

It is true, of course, that some of these are fundamental problems 
characteristic of all educational research and that both researchers 
and people in the field have a responsibility for seeking a balance 
between the right of privacy and the advancement of social research. 
As Dr. Tillery said in 1966^ at a symposium of the National Council 
on Measurement in Education in New York on this subject: 

In summing up, scientists should have the right to study human
phenomena but also the responsibility to seek the cooperation of
individuals and institutions in a manner which clearly respects the right
of privacy and the protection of anonymity. This forces the
investigator into very careful plans for seeking counsel and understanding
of individuals and groups associated with his enterprise. It means the
willingness and the ability to communicate the importance of the
research and the basic rationale for the methods and techniques being
used. If we are not willing to involve practitioners in our work,
particularly those who carry the weight of responsibility for
decision-making in a time of great social stress, we may be forced to restrict
the kinds of investigations which we may conduct.

The problems discussed so far tend to fall into the category of 
externally derived constraints. They are imposed on the researcher 
by the very nature of the current environment and he has little choice 
other than to cope with them. There is another category which tends 
to stem from within the research world — though naturally this too is 

• School to College: Opportunities for Postsecondaiy Education 
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affected in part by developments outside. In this group can be in- 
cluded the complex factors that arise out of the many and varied re- 
search and data processing methodologies and the current emphasis 
on the interdisciplinary approach. The range of acceptable research 
techniques is far greater today than it was a mere few years ago. 
Moreover, the age of the computer and its affiliates now make it possible
to initiate projects of a magnitude that could not have been
conceived a decade ago. Under these circumstances hard decisions 
have to be made as to the most appropriate and feasible project dimensions
as well as the techniques to be used. The decision as to
whether to use the micro or the macro approach to a problem is no 
longer based on the question of whether the latter is possible so 
much as it is on a consideration of which will be the more appropriate. 

The idea of interdisciplinary research is now popular and for good 
reason since it is agreed that the background and approach of re- 
searchers in various fields, particularly in the behavioral sciences, 
should be brought to bear on problems in higher education. But the 
task of organizing a team of researchers representing several disci- 
plines is easier said than done. Often the individuals either do not 
wish to cooperate on the same project or they do not have the tem- 
perament to do so. A representative from a discipline may engage in 
research on a given educational problem and may even confer from 
time to time with his peers in other disciplines, but this alone is not 
interdisciplinary research. Naturally, there are many examples of 
teams from various fields which are successfully mounted, but they 
are the exception and the process is difficult.

If the foregoing identification of certain difficult problems of research
in higher education can be accepted, the problems should now
be viewed for their implications to individuals and units engaged in 
research. Let us first examine their relevance to individuals who are 
attached to colleges and universities and who are involved in con- 
ducting research either by themselves or as members of an institu- 
tional research unit. A number of considerations come to mind. In 
the first place, everyone has the problem of determining the relative 
significance of potential research projects and of attacking those 
which seem to be most in need of study in light of today's societal
perplexities. Anyone can keep busy doing research of interest to him 
and undoubtedly to others, but since there is neither enough money 
nor talent to attack all the most serious problems, some value judgments
have to be made concerning those to which the resources
should be allocated. It would appear that too many bureaus of institutional
research engage in various types of "head counting" which
yield little more than "nice-to-know" information and really do not
make much impact on the institution. 

A second guideline is that projects undertaken must be manage- 
able. With present research technology there is the temptation for 
researchers to bite off more than they can chew. This is true of 
graduate students as they undertake dissertation studies and it is a 
disease to which we are all susceptible. Often research results will 
be more meaningful if they are pinpointed and if the findings are 
reported quickly and without undue complexity. 

Another possibility is for representatives of groups of institutions 
to organize themselves into consortia arrangements so that they can 
assemble comparable data from across the institutions represented 
and thus have broader bases for comparisons and generalizations. In 
a sense this idea is in opposition to the preceding one that projects 
should often be small. However, in view of the data processing capa- 
bilities available today, it is possible to have both large-scale and 
small-scale projects with each filling a particular need. Another pos- 
sibility for participation in large-scale projects which still involve 
one's own institution is through cooperation with organized research
agencies on specific projects thereby gaining the benefit of comparable
data from many institutions and the assistance of the staff of the
large research agency which is often geared to the macro approach.

Perhaps the most important implication of all is that data coming 
with comparative ease from many projects need to be carefully inter- 
preted for their implications for individual institutions. The sheer 
quantity of data sometimes leads to hasty generalizations or some- 
times to none at all. The opportunity for individual researchers or 
for those in bureaus of institutional research to compare their own 
findings with those of others and to postulate further implications is 
great and the individual researcher who fails to take the comparative 
stance is forfeiting an opportunity to make his own research of 
greater significance. 

Naturally, many implications can be drawn for large-scale organ- 
ized units based in universities or elsewhere. To some extent, the 
implications for these units are similar to those pertaining to individuals
or institutional research bureaus, but in other ways they differ
considerably simply because of the nature and size of most organized 
units. These centers, for example, have the same problem of determining
the relative significance of issues in need of research and thus
of setting priorities for themselves in terms of their program. As 
Norman Boyan^ said in a major address at the American Educational 
Research Association conference in Los Angeles: 

What we know now as never before is that we must also make sense 
out of the following questions: What are your substantive priorities? 
What significant problems are you trying to solve? How do you propose
to allocate your resources to solve these problems? What time-frame
is necessary for solving these problems? What evidence will
you accept that you are moving toward solution of these problems? 

On the other hand, because of their overall capacity the centers 
inherit the even more fundamental problem of determining the balance
between basic research and other types of inquiry. It would be
entirely appropriate for a unit to undertake an exceedingly basic 
study on the learning process at the college level, but whether to do 
so may depend upon the press for solutions to other current problems 
in higher education, solutions that suggest a greater emphasis on 
applied or policy-oriented research. The criteria for making such a
determination will vary with the objectives of the research center and 
the sources of its funding. 

Obviously, it would be lamentable if in their zeal to deal only with 
the problems of the day, research centers were to omit entirely any
consideration of contributing to knowledge through research on some
of the fundamental problems of educating people. On the other hand,
if the crucial problems confronting the colleges today are to be
solved in the light of rationality there should certainly be some input 
from research and there is a good question concerning who will make 
this input if organized centers do not. It is true, of course, that in 
some areas basic research and other types of inquiry are not mutually
exclusive and thus basic research does not preclude investigation 
leading to policy determination. 

The organized center also has the opportunity and probably the
responsibility to conduct its research on a programmatic basis so that 
one step tends to follow another and also so that the total effort in 
researching a problem area is coordinated, even if several projects 
are involved. 

The organized centers also face all the problems inherent in the 
interdisciplinary approach. Because of their size and the fact that 
they often reside within universities, the use of an interdisciplinary
team ostensibly is relatively easy for them; on the other hand, their
utilization of this process is subject to all the problems and limita- 
tions referred to earlier. 

Perhaps the most crucial problem of all in an organized center, and 
particularly one funded as a research and development agency, is 
the relationship between research and development. As a matter of 
fact, there is a prior question of just how development should be defined.
Generally speaking, it is presumed to be those efforts which
effect change in education, particularly by the use of research find- 
ings. While research and evaluation may stem from educational prac- 
tice in the field, the more general notion is that research findings need 
further experimentation and that change should be expected to fol- 
low from research. In any event, the efforts of an R and D center are 
presumably directed toward decision makers and this poses two 
problems. One is that of determining how much of a center's time
and effort should be devoted to development and the other is how a 
center can best reach and influence decision makers. 

An agency devoted to the study of higher education has a particular 
problem with respect to the latter question in that the decision makers 
in this segment are exceedingly diverse and dispersed. They include
faculty, students, administrators, members of governing boards, statewide
coordination agencies, and legislative bodies. Nonetheless, with
the mounting problems in higher education an organized research 
unit in this field, whether or not expected to do so by its funding 
agency, is obligated to be concerned with the process of effecting 
change despite all the problems involved. 

During the last few years some good questions have been raised
about development and many researchers have been concerned with 
the expectation in certain circles that research will immediately pro- 
duce the means by which education can be revolutionized. Other re- 
searchers have feared that an overemphasis on development would 
militate against their research productivity. In his AERA-PDK* 
Award Lecture at the Annual meeting of the American Educational 
Research Association in 1967 in New York, T. R. McConnell made 
some sound remarks about this problem: 

I should say that an interest in development does not necessarily endanger
educational research, either basic or applied. It is pressure
for a quick pay-off, for an educational cookbook, some more hardware,
for a new and magic educational nostrum, that threatens both
significant research and sound development. The notion in some quarters
that it is only a jump from either basic or applied research to improved
educational practice is much over-simplified. Experience in
other fields has shown that many processes intervene between research
and production. It also has been demonstrated that evaluation
must accompany development. The transition from research to
practice is not one leap. It is a process, a flow, from basic through 
applied investigation, to invention and development, to innovation in 
practice or production, and, finally, to evaluation. Without evalua- 
tion, development may easily become quackery. 

McConnell's statement helps to clear the air about the relationship
between R & D, but most organized research units must still struggle
with the daily problem of how and what to do as a means of bringing 
research findings to bear on practice. 

We might refer briefly to the Berkeley R&D Center as an example
of a university-based organized research unit that is concerned
with most of the problems outlined above. The Center was established
in 1956 and for nine years was known as the Center for the
Study of Higher Education. During this time it operated on a reasonably
small-scale basis delving primarily into the problems of student
development, but also to some extent into institutional analysis,
statewide coordination, and related matters. In 1964, when the fed- 
eral government announced its intention to establish a number of 
R&D centers, the Berkeley unit was invited to submit an applica- 
tion for funds to become such a center which it did in September, 
1965. Since then its two major research foci have been (1) continuation
of the earlier interest in the impact of college on student
development and (2) college organization and administration with
considerable emphasis on planning. Its research program has been 
accompanied by a strong emphasis on development and dissemina- 
tion. The Center has tried to be both programmatic and interdisci- 
plinary in its approach to research, but in doing so has encountered 
the difficulties enumerated earlier. It has now almost completed an 
extensive examination of its program and a delineation of its role
and scope for the next few years. In this process, it has attempted to 
sharpen the focus of both its research and its development program. 
It looks now as if its primary focus will be on how best to extend the 
learning environment and that the approach for this effort will be
through the examination of new types of educational programs and 
of emerging governance configurations.

In the final analysis, of course, it is the users of research who are 
the most important parties of the enterprise. You, who are in the
field, have a heavy responsibility to help make research relevant and
meaningful. While not everyone in this audience is engaged directly
in higher education, many are and many more are by reason of being
in secondary schools indirectly concerned. There are various ways by
which your role can be enhanced. Let me mention three. First, you 
can help identify the problems which, in your judgment, are crucial 
and which need the input from research to help solve them. Once 
you identify the problems they can be communicated to those re- 
search agencies, either within your institution or elsewhere, that 
seem to have the greatest potential interest and capability to attack
the problems. Second, you may stand ready to cooperate with other 
agencies in major research activities. After all, investigations cannot 
be done in a vacuum, but instead must be carried on in the higher 
education community itself. Third, you may cooperate in development
activities in which further exploration, experimentation, or
utilization seem necessary to validate research findings. As we move 
from an era in which research findings were relatively passive to one 
in which we believe they must be active, the opportunity for engaging
in development activities will be far greater than it has been in
the past. The future, then, should bring an expanded opportunity for 
researchers and practitioners to work as a team.

Educational Research, Educational Development
and Evaluation Studies

JOHN K. HEMPHILL

No one who is in touch with the problems of our troubled society
can fail to be aware of the criticism that is directed toward our educational
system. The validity of any specific criticism may be debated,
but there are few who can defend the system as it is now
functioning. Although we have many reasons to be proud of our 
past accomplishments in education, our future aspirations seem un- 
likely to be attained. 

Educational research, educational development and evaluation 
studies are the major tools by which we can improve education. Un- 
fortunately, confusion exists as to what can best be accomplished by 
each of these tools. The thesis of this paper is that the functions of 
educational research are distinctly different from the functions of
educational development and, further, that evaluation studies in edu- 
cation have yet a third function. Moreover, each of these tools has 
distinct characteristics which need to be made very explicit. A clear
notion of the characteristics and of the functions of educational research,
development and evaluation will make it possible for us to
move ahead toward the achievement of needed educational reform.

ACTION 

If one is to perform a purposeful act, he must do three things and 
do them in sequence. First, he must formulate an intention, i.e., he
must visualize a state of affairs which differs from the state of affairs 
he believes now exists, and commit himself to bringing about the new 
situation he visualizes. Second, he must operate upon his environment
in a way which he calculates will produce the state of affairs
he desires. Third, after he has performed the operation he must compare
the state of affairs that he has caused to exist with the state of
affairs that he had intended to achieve. He then notes whatever dis- 
crepancies remain. These discrepancies serve as feedback to influence 
his next intentions and, thus, guide his further action toward his 
desired end. It does not matter whether a purposeful activity is a
simple one, such as finding a pencil upon a cluttered desk or a very 
large and complex one, such as leading a Christian life. In either 
case, the same three basic steps must be taken in sequence: an intention
must be formulated, an operation or set of operations must
be performed and a comparison made at the end to evaluate the
results. 
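
The three-step cycle just described can be sketched in code. What follows is only an illustrative sketch and not part of the original address: it assumes a "state of affairs" can be reduced to a single integer, and the names `purposeful_act`, `operate` and `move_halfway` are hypothetical.

```python
def purposeful_act(intended, current, operate, max_cycles=10):
    """Repeat operation and comparison until no discrepancy remains.

    Assumption for illustration: a "state of affairs" is one integer.
    """
    discrepancies = []
    for _ in range(max_cycles):
        # Phase 2 (operation): act on the environment.
        current = operate(current, intended)
        # Phase 3 (comparison): achieved state versus intended state.
        discrepancy = intended - current
        discrepancies.append(discrepancy)
        if discrepancy == 0:
            break  # nothing left to feed back into further action
    return current, discrepancies


def move_halfway(current, intended):
    """Hypothetical operation: close about half of the remaining gap."""
    gap = intended - current
    return current + (gap + 1) // 2


# Phase 1 (intention): visualize a state (10) unlike the present one (0).
final_state, feedback = purposeful_act(intended=10, current=0, operate=move_halfway)
print(final_state, feedback)
```

Each entry in `feedback` is the discrepancy noted after one operation; the loop ends when the comparison shows that the intended state has been reached or the cycle limit is exhausted.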

The second or operational phase of a complex act is often made 
up of nested sets of subacts. Thus, in carrying out an intention of 
leading a Christian life, an individual, among other subacts, may 
join a church and, within this subact and as a further subdivision of 
this large subact, attend services each Sunday. 

The three phases of purposeful action (intention, operation and
comparison) correspond closely to the basic functions of educational
research, educational development and evaluation studies. Educational
research can contribute to educational reform by providing
new knowledge to be mixed with experience (old knowledge) in 
shaping our intentions or (in more usual language) contribute to the 
setting of objectives or goals. Orderly change or reform is not pos- 
sible if one cannot visualize a better state of affairs than the one 
which presently exists. New knowledge generated by successful research
can enlarge the number and the attractiveness of alternatives
which one may consider.

Educational development contributes to educational reform by 
providing new and more powerful ways of operating upon the edu- 
cational environment. It creates new educational products and/or 
new human capabilities which, when properly applied, can make
significant changes in educational practice. 

Educational evaluation produces evidence that can sharpen value 
judgments about the present state of affairs in education. Evalua- 
tion, in essence, is simply comparing two states of affairs, one of 
which is considered to be more desirable than the other. Evaluation 
always implies a value judgment.

In a general way, it is suggested that the three major tools available
for accomplishing educational reform correspond one-for-one
with the three basic steps of purposeful action. Educational research
(or considering the findings of research) is to be employed when one 
is in the process of formulating his intentions, that is, setting his
objectives. Educational development provides a manner for operat- 
ing systematically upon these intentions. Educational evaluation pro- 
vides a way of assembling evidence from which one may judge how 
well the state of affairs he intended to bring about has been achieved. 

Let us now examine certain distinctive characteristics of educa- 
tional research, educational development and evaluation studies. 

EDUCATIONAL RESEARCH vs. EDUCATIONAL DEVELOPMENT 

Research and development in education are almost always con- 
founded in the thinking of most persons. R & D has become fused 
and often is regarded as a unitary process. Yet, there are characteristics
of educational development, both in its purposes and its methods,
that clearly differentiate it from educational research.

Educational development is the systematic process of creating new 
alternatives that contribute to the improvement of educational practice.
Educational research is a scientifically disciplined process of
creating new knowledge relevant to education. The findings of re- 
search in education seldom can be used to improve education without 
doing a considerable amount of additional work. Research outcomes 
are most often reported in technical terms without reference to pos- 
sible practical application. New knowledge usually becomes useful 
only after much transformation, adaptation, and mixing with other 
knowledge which has been gained from experience. 

It seems unnecessary to discuss here the process and methods of educational
research. The canons of scientific research in general have
been made explicit to the point of common knowledge. These prin- 
ciples and methods which apply to educational research as well as 
to any other research are taught as part of most graduate curricula. 
Educational development, however, is not well understood. 

Educational development can take very different forms since the
activities that constitute development are quite varied. No one process
has been identified that provides a "blueprint" for educational development,
perhaps because at this time too little experience has
accumulated to form a substantial base from which to judge or evaluate
alternate processes. However, two trends that suggest the major
emerging variants can be seen among the activities of those engaged 
in educational development. These are the product development
process and the change support process.

THE PRODUCT DEVELOPMENT PROCESS

The product development process seeks to bring about improve- 
ment in educational practice by creating materials, procedures, or 
devices which, when used as directed, are known to yield desirable
and specified outcomes. The emphasis is upon creating a tested and
proven "package" with appropriate supporting materials such as
manuals of instruction, operator training material, teacher guides, et
cetera. Thus, the outcomes of a product development process can
be described as "packages of things" that have physical identity.

A basic assumption of the product development process is that 
school personnel will be sufficiently motivated to seek and utilize
the new and possibly better materials or procedures. A major block
to be overcome in improving educational practice through product
development is the unavailability of tested and proved educational
materials. It is assumed that better materials need only be made
available in order that improvements in educational practice will
occur. This assumption is shared by the old adage, "Build a better
mousetrap and the world will beat a path to your door." 

It is important to describe in some detail what product development
in education means in terms of the tasks involved. Although
the exact nature of these tasks and their sequence of performance
will vary from product to product, at least a general pattern of activities
can be observed.

The first step in such a process is the judicious selection of the 
product to be produced. This begins with the awareness of a need
or problem for which the product might provide a full or partial solution
and involves a very broad specification of the product's characteristics
conceived in terms of objectives, costs, feasibility, etc.

The second general step in the product development process is to 
carefully review the state of the art and knowledge from which the 
product is to be developed. This includes scrutinizing research literature
in all relevant areas, assembling valid practical experience, and
estimating the costs and difficulties encountered in bringing together
the elements essential to the development.

The third step is invention and design. This entails elaborating 
the product's specifications and fixing upon one or a very few alternative
"models" of the product to be created.

The fourth step in the development process is to prepare a prelim- 
inary version — a "mockup" or "prototype" — of the product and to 
test or examine its performance. This version will only be partially
adequate, but will provide information critical to succeeding steps.

The fifth step is to analyze the preliminary test data, applying di- 
rect attention to their implications for redesign of the product. 

The sixth step is to assemble a revised version of the product, 
which incorporates the experience from the earlier version, again 
subjecting this revised "model" to a performance examination. Steps
five and six may be repeated any number of cycles before moving to 
step seven, depending upon how successful the design-test-feedback- 
redesign operation has been. Once, however, a model of the product 
is produced which appears to perform to specifications, work pro- 
ceeds to field testing. 

Step seven, field testing, is to design and conduct a rigorous test 
of the product in a situation which duplicates most of the known
relevant characteristics of the operating environment. Specific data 
are gathered about the performance of the product within differing
general environments that will yield the "limits" within which the
product may be expected to perform. 

The final step is that of operational testing. This differs from field 
testing, in that the group responsible for the development work retires
from direct involvement in this further testing of the product.
This step establishes the feasibility of releasing the product for nor- 
mal operational use without constant supervision by its originator. 
Only after the last hurdle of field testing is the product judged to be 
ready for dissemination.
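
The sequence of steps above, with its repeated design-test-feedback-redesign cycle, can be sketched as a short program. This is a hedged illustration only, not a procedure given in the paper: the function `develop`, the step labels, and the stand-in callables (`test`, `redesign`, `meets_spec`) are all hypothetical, and product quality is assumed to be a single integer score.

```python
EARLY_STEPS = ["select product", "review state of the art", "invent and design"]


def develop(prototype, test, redesign, meets_spec, max_cycles=10):
    """Walk the eight steps; return (model, log, ready_for_dissemination)."""
    log = list(EARLY_STEPS)                 # steps 1-3
    model = prototype
    log.append("test prototype")            # step 4
    result = test(model)
    for _ in range(max_cycles):
        log.append("analyze test data")     # step 5
        model = redesign(model, result)
        log.append("test revised model")    # step 6
        result = test(model)
        if meets_spec(result):
            break                           # performs to specifications
    else:
        return model, log, False            # cycle limit reached first
    log.append("field test")                # step 7
    log.append("operational test")          # step 8
    return model, log, True


# Hypothetical stand-ins: quality is an integer, each redesign adds 2,
# and the specification calls for a score of at least 5.
model, log, ready = develop(
    prototype=1,
    test=lambda m: m,
    redesign=lambda m, result: m + 2,
    meets_spec=lambda result: result >= 5,
)
print(ready, model, log.count("analyze test data"))
```

In this run steps five and six are repeated twice before the model reaches the assumed specification, after which work proceeds to field testing and operational testing.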

THE CHANGE SUPPORT PROCESS 

Educational development that is conducted following the change 
support process directly addresses changing the practice of educa- 
tion. It emphasizes intervention in the behavior of educators. In gen- 
eral, material things are regarded as incidental or clearly subordinate 
to improved attitudes, skills, motives and abilities of people. The behavior
to be improved includes group or organizational interaction
of people as well as that of the individual educator. The focus of
efforts is not limited to individual remediation, but may also include
rearrangement of relationships among groups, as these in turn affect
the behavior of individuals within them. A basic assumption of the
change support process in development is that educational practice
is improved by direct intervention in what educators do. 

The process involved in educational development through change 
support emphasizes flexibility. Each human situation is different
from all others, and each must be met differently. No prescription
can be written which will be effective in all or a majority of situations.
As a direct consequence, the persons engaged in change support
seldom make explicit in advance just what steps they will take
toward their objectives. From this point of view, development is a
continuing process, never to be completed, since improvement can 
never be said to reach a point where further improvement is not 
possible. Objectives are regarded only as temporary states in a con- 
tinuously changing set of human relationships. 

The activities of an educational developer guided by this process
are also characterized by flexibility. One role that has been de- 
scribed explicitly is that of change agent. Such an agent attempts to 
stimulate interest in changing present practices, provides informa- 
tion about what is possible, and encourages those who are attempting 
to change or to make changes in others. In some sense he functions 
as a catalyst in a larger process. Another role for the developer using
this approach is that of a coordinator. He strives to bring together 
persons or agencies where improvements in education might result 
from increased communication, or where the effectiveness of activities
that are being performed relatively independently could be increased
if they were done in concert. A coordinator's role may involve
him in negotiation and politics — especially professional politics — and 
may require the ability to manage the use of power. Still another role
is that of trainer. In this role, the developer acts as a super-teacher 
of school personnel, but not of school students. There are a number 
of techniques available to the trainer that can be described as spe- 
cific entities. Among these are role-playing, sensitivity training, T- 
group techniques, psychodrama, etc. 

DEVELOPMENT STRATEGY 

It is not possible at this time to prescribe a best strategy for edu- 
cational development because of our very limited experience with it. 
Most of the strategies being employed today appear to be mixed 
strategies with different emphasis upon one or the other of the two 
approaches described in this paper. The major factor that has influ- 
enced the strategy adopted by those now engaged in development 
work is the background and experience of the individual developers. 
Opinions and beliefs about how education can be improved far out- 
weigh solid evidence based on evaluated experience. Most of the 
persons now engaged in directing educational activities have entered
their new work from a wide range of previous professional
occupations and have been trained in a variety of academic disci- 
plines. Generally, they have earned graduate degrees in education, 
psychology, or sociology. Many have backgrounds that include classroom
teaching and school administration. Because of their widely
differing backgrounds, it is readily understandable that the differing 
assumptions of the two approaches (product development and change
support) have different appeal. 

EVALUATION STUDIES 

Evaluation studies are frequently confused with research. It is clear 
that many activities are shared by research on one hand and evalua- 
tion studies on the other, but one cannot be considered a simple sub- 
set of the other. Less confusion exists, however, between evaluation 
studies and educational development. Evaluation for the purpose of 
providing feedback is a subactivity in the development process. In 
terms of the analysis of purposeful acts described earlier, evaluation 
done as a part of development is simply a subset within the larger 
operation. 

Let us return to the task of making explicit the relationship between
research on one hand and evaluation studies on the other.

Evaluation studies imply comparison and decision about alterna- 
tives; by undertaking an evaluation study, one at once addresses him- 
self to questions of value and utility. It may be objected, however, 
that this is a too idealistic view of the purpose of evaluation studies. 
In fact, the great majority of evaluation studies in education may 
not be concerned with the alternatives per se, but instead ask the
simple question, "Does treatment X work?" At best, there may be an
implicit assumption that, "if X does not work, we will have to try
something else," but this is as far as thinking about alternatives may
go. Nevertheless, and regardless of the lack of precision in thinking,
providing information for choice among alternatives remains the basic
purpose of evaluation studies. 

The implications of primacy of utility in evaluation studies and 
the relative unimportance of such a consideration in research are 
profound. Although there are differences in points of view among 
behavioral scientists, an "ideal" research study is characterized by 
most, if not all, of the following:*

* I am indebted to Richard Watkins, Program Coordinator, Far West Regional
Laboratory for Educational Research and Development, Berkeley, California,
for much of this material.

1. Problem selection and definition is the responsibility of the
individual doing the research.

2. Tentative answers (hypotheses) to the problem may be available 
by deduction from theories or by induction from an organized 
study of knowledge. 

3. Value judgments by the researcher are limited to those implicit
in the selection of the problem.

4. Given the statement of the problem and the hypothesis, the re- 
search can be replicated. 

5. The data to be collected are determined largely by the problem
and the hypothesis.

6. Relevant variables can be controlled or manipulated, and system- 
atic effects of other variables can be eliminated by randomization. 

Almost the reverse of all six of these statements characterizes the
evaluation study:

1. The problem is almost completely determined by the situation 
in which the study is conducted. Many people may be involved 
in its definition and because of its complexity, the problem ini- 
tially is difficult to define. 

2. Precise hypotheses usually cannot be generated. There are many 
gaps where the absence of verified knowledge must be filled 
with judgment and experience. 

3. Value judgments are made explicit in the selection and the defi- 
nition of the problem as well as in the development and imple- 
mentation of the procedures of the study. 

4. The study is unique to a situation and seldom can be replicated 
even approximately. 

5. The data to be collected are heavily influenced, if not determined,
by feasibility. Choices, where possible, reflect value judgments
of decision makers or those who set policy. Gaps exist
between data that are feasible to collect and the data that
would be most useful to the decision maker.

6. Only superficial control of a multitude of variables important to 
interpretation of results is possible. Randomization to eliminate 
the systematic effects of these variables is extremely difficult or
impractical to accomplish. 

Evaluation studies are not just poorly performed or less rigorous
research studies. In fact, they can and should be done with as much
rigor and imagination as the best of research. However, they differ in
that they are undertaken in response to a need to know the usefulness
of some combination of old and new knowledge which has resulted
in the invention of an alternative to existing modes of action.
Is a new method of training teachers an improvement over a presently
used method? Is a specific Head Start program effective in preparing
disadvantaged youngsters to enter school?

If we accept the proposition that the basic reason for undertaking
an evaluation study is to develop information that will assist a decision
maker in choosing rationally among the alternative courses of
action, then an evaluation study is to be viewed from a perspective
quite different from that from which one might view a research
study. The highly regarded research act, "refuting a null hypothesis,"
carries little or no useful meaning, since for a decision maker to 
know that he cannot reasonably consider some situation or condi- 
tion which is not stated in the hypothesis provides little guidance 
for the choices he must make. Confidence in a conclusion, as represented
by the research convention implied by the general acceptance
of the ".05 or .01 probability level" as the criterion for "belief" in a
research finding, is a luxury a decision maker seldom can afford.
Rather more frequently he faces situations where any information 
more dependable than that provided by a "flip of the coin" is des- 
perately needed. The concept of "sampling" a domain of problems, 
of which the unique problem the decision maker faces in making a 
particular choice is only one case, is simply not applicable in the 
decision situation, but is the foundation of research design. 

Statistical decision theory provides a proper framework for under- 
standing evaluation studies. Within this framework an evaluation 
study becomes a process of acquiring further information, or new 
information, that can be used by the decision maker. The decision 
maker's probability estimates of the consequences of a contemplated
act can be modified as a direct result of the outcomes of the contemplated
evaluation study. His expectations about the outcomes of
the contemplated evaluation study also have an estimable probability.
This fact makes it possible for decision makers to step back a
step and make a reasonable decision about whether an evaluation 
study would likely be worth what it costs. Thus, the expected value 
of carrying out an evaluation study is determined by the same cri- 
terion that is used to judge the consequences of an action. This 
criterion is not the criterion of the research worker who finds his
"payoff" in the creation of "new knowledge," but is the "payoff" of
the consequences of action taken by a decision maker.

The major difference between evaluation studies and research
studies is not the subject of interest or the method of inquiry of the
researcher and evaluator. It is to be found in the manner in which
the outcomes of the two are used and regarded.
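
The decision-theoretic view of an evaluation study described above can
be made concrete with a small numerical sketch. The Python example below
is an illustration only, under stated assumptions: the payoff figures,
the prior probability, and the single "accuracy" parameter (how often a
study reports correctly, assumed the same in both directions) are
hypothetical and do not come from this paper. It estimates how much an
evaluation study may be expected to add to the decision maker's payoff,
a figure that can then be weighed against what the study costs.

```python
# Hypothetical payoffs to the decision maker for each (action, true state).
# These numbers are assumptions chosen for illustration only.
PAYOFF = {
    ("adopt", "better"): 100.0,      # new method adopted and it is better
    ("adopt", "not_better"): -60.0,  # new method adopted but it is not better
    ("keep", "better"): 0.0,         # old method kept in either case
    ("keep", "not_better"): 0.0,
}
ACTIONS = ("adopt", "keep")


def best_expected_payoff(p_better, payoff):
    """Expected payoff of the best action, given P(new method is better)."""
    return max(
        p_better * payoff[(a, "better")]
        + (1.0 - p_better) * payoff[(a, "not_better")]
        for a in ACTIONS
    )


def value_of_study(prior, payoff, accuracy):
    """Expected gain from running the study before choosing an action.

    prior    -- decision maker's P(new method is better), strictly in (0, 1)
    accuracy -- assumed P(study reports the true state), strictly in (0, 1)
    """
    # Probability that the study comes back favorable to the new method.
    p_fav = accuracy * prior + (1.0 - accuracy) * (1.0 - prior)
    # Bayes' rule: revised belief after a favorable or unfavorable report.
    post_fav = accuracy * prior / p_fav
    post_unfav = (1.0 - accuracy) * prior / (1.0 - p_fav)
    # Expected payoff if the decision is made after seeing the report.
    with_study = (
        p_fav * best_expected_payoff(post_fav, payoff)
        + (1.0 - p_fav) * best_expected_payoff(post_unfav, payoff)
    )
    # Gain over deciding now on the prior belief alone.
    return with_study - best_expected_payoff(prior, payoff)


for acc in (0.5, 0.7, 0.9):
    print(acc, round(value_of_study(0.5, PAYOFF, acc), 2))
```

With these assumed numbers, a study no more dependable than a flip of
the coin (accuracy 0.5) adds nothing, while one that reports correctly
70 per cent of the time is worth about 6 payoff units. On this view the
study is worth undertaking only if it costs less than that.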

SUMMARY 

In summary, it has been suggested that research, development, and 
evaluation are the tools we must use if we wish to reform and renew 
educational practice. These tools have an analogous relationship to 
the three steps or phases of any purposeful action: the formulation 
of an intention, the operation upon that intention, and the comparison
of the intended state of affairs with a realized state of affairs.
Each of these tools has unique characteristics which fit it to the
different functions that must be performed if education is to be
improved.

If we are to make orderly and rapid progress in improving our 
education system, we who assume some responsibility for the task 
are obliged to know our tools and use them with insight and skill. 



The Challenge of Multi-Agency 
Involvement in Development 

RAY JONGEWARD 

This presentation will employ a five-step planning model as a frame- 
work for the following data and information. It is a simplistic model 
and consists of: 

Step 1. Who hurts? 

Step 2. Why do they hurt? 

Step 3. Who has the aspirin? 

Step 4. Why isn't the aspirin as big as the headache?

Step 5. How can we make a better aspirin? 

First, some background on the Small Schools Program of Northwest
Regional Educational Laboratory. The 27 per cent of the land
area of the U.S.A. which is composed of Alaska, Montana, Idaho,
Oregon and Washington is 80 per cent rural. With the exception of
urban centers, especially the "strip city" called the Puget-Willamette
trough which extends from Everett, Washington, to Eugene, Oregon,
much of the Northwest likely will remain rural for many years to
come. Three years ago when the Laboratory was established, the
report of the Five-State Task Force identified the plight of rural
schools in the Northwest region as one of four educational priorities.
NWREL's Small Schools Program addresses itself to one of these.

Four crucial needs were cited by the report as having special rele- 
vancy to rural education in the Northwest. They were: 

1. The lack of adequately trained teaching personnel, usually characterized
by a high turnover rate, and the lack of inservice
opportunities

2. A narrow and frequently out-of-date curriculum offering

3. The low aspiration level of rural students

4. The economic and cultural deprivation existent in these geographically
isolated environments ("rural" and "poverty" are
nearly synonymous)

NWREL has developed several activities focused on these problems
of isolated schools. For example, the Intercultural program has
been working on the creation, development and testing of readers
for Alaskan natives and Indians. Built upon the graphoneme concept,
they are being field tested in seventeen villages in Alaska. This
testing involves the Bureau of Indian Affairs, the Indian Tribal Councils,
the State Department of Education, the University of Alaska
anthropologists, linguists and others. Time, however, does not allow
me to speak in detail on the readers.

For the purpose of this presentation, one activity has been chosen 
as an example from among the six that comprise the Small Schools 
Program. It should illustrate how NWREL has interpreted its task 
of forming a bridge between research knowledge and classroom prac- 
tice. There are many roles that must be played by Laboratory per- 
sonnel in dealing with each agency or group in an attempt to bring 
about quality changes in the educational establishment. 

The use of this example, however, is not intended to suggest that
rural educational problems have been solved in the Northwest. They 
have not! Patterns In Arithmetic (PIA), however, does hold promise 
for making some improvements in the rural environment. The Labo- 
ratory has been using PIA to improve education in rural isolated 
schools. The sequence was developed at the Research and Develop- 
ment Center, University of Wisconsin. 

PIA is a modern elementary mathematics program for grades 1-6
consisting of videotape TV lessons, teachers' manuals and pupil
exercise booklets (based on the work of Van Engen).* The 15-minute
lessons introduce new concepts, review previously covered concepts
and skills, and provide motivation toward the study and understanding
of arithmetic. Grade 1 consists of 32 lessons, approximately 1 per
week; grade 2 has 48 lessons with 3 lessons every 2 weeks; grades 3-6
consist of 2 lessons per week. The series attempts to update teachers
in modern math principles and teaching methods while students are
being taught basic arithmetic concepts. Original field testing in 1966-67
included 9,000 students in Wisconsin and Alabama. This school
year over 138,000 students throughout the nation are using PIA.



* Van Engen, Henry, Assoc. Director, Development

(Educational Testing Service participated in this program by developing
the midterm and final tests.)

A careful review of PIA by rural educators confirmed the ideas of
NWREL staff that it offered promise in helping rural elementary
teachers. In such cases, the Laboratory plays several roles in intro- 
ducing a new approach to regional educators. 

NEGOTIATOR Appropriate arrangements were made with the
Wisconsin R&D Center and with the National Instructional Television
Center to convert the two-inch videotape recorder tapes to one-inch
reels, enabling them to be used on less expensive, portable video
equipment. Now, PIA can be used in rural areas where educational
television stations are nonexistent.

SALESMAN Consistent with the NWREL philosophy of working
with and through existing organizations, approaches were made to
three State Departments of Education in the Northwest region. The
potential of PIA was explained and assurances were given that its
use would not infringe upon the State Department's service and/or
its supervising function.

CATALYST In addition, the Laboratory felt that obtaining a com- 
mitment from these State Departments of Education to engage in 
testing, monitoring and evaluating new products like PIA would 
facilitate a new role for each of these agencies. 

EDUCATOR Subsequent conferences were successful with key 
State Department personnel in the three states. Agreements were 
reached whereby NWREL would supply the PIA materials and vid- 
eotape equipment. The Laboratory also would provide the evalua- 
tion procedures to be employed. The State Departments agreed to 
select the sites, monitor the action, collect the data and appoint a 
person to coordinate these efforts. 

TEACHER Conferences in each state were actually teaching-learning
sessions devoted to understanding change processes and evaluation
strategies. For example, some of the factors considered were:

1. The need to overcome the suspicion of rural teachers, administrators
and community leaders of "outsiders who dress, act and
talk differently"

2. The wariness which rural female teachers have of modern technological
equipment, e.g., a survey of 16 schools found 1 movie
projector

3. The need to engage in collecting data and information to evaluate
the use of the materials and determine the problems that
developed

4. The roles to be played by State Department personnel, the coun- 
ty superintendent, the local administrator and teachers, e.g., 
over-supervision was a worry 

5. The determination of necessary data and gathering techniques. 

DEMONSTRATOR At this point, what might loosely be called a 
remote control operation began. NWREL demonstrated the use of 
the materials and the equipment to State Department personnel. 
They, in turn, demonstrated them to local school boards, administrators,
teachers and students. Laboratory personnel worked through
the problems of collecting data and information regarding the use
of PIA in the local district and suggested alternative methods of
evaluation. Forms were constructed as guidelines for site visitations.
Interview forms also were devised to obtain the reactions of pupils, 
teachers and administrators. In short, a formative evaluation design 
was developed that met the requirements of both the State Depart- 
ments of Education and the Laboratory. 

FOLLOWUP NWREL field staff kept in close contact with State
Department personnel during the school year. They also visited each
of the sites as part of the general plan. On May 15, 1969, State Department
of Education personnel from the three states reviewed the
preliminary report and suggested modifications. Later that month, a 
feedback session was scheduled with the authors. Finally, NWREL 
personnel have been preparing the report for the University of Wis- 
consin Center regarding the experiences of using PIA in rural, iso- 
lated elementary schools in the Northwest region. Center personnel 
then must decide the extent to which they can modify the PIA ma- 
terials to more nearly meet the needs of these teachers and pupils. 
Together with the State Departments, the Laboratory is planning 
the work for the 1969-70 school year. Naturally, these plans depend
upon the modification decisions by the staff of the Wisconsin Research
and Development Center.

PRELIMINARY RESULTS FROM FIELD TRIALS

Geographically, this is a far-flung operation which covers one re- 
mote school in Alaska, two rural schools in Montana and three in 
Idaho. Year-end test results are still being prepared. Preliminary
data have revealed:

1. State Department personnel initially committed to the project
from all three states were all positive concerning their:

a. New role in innovation and change 

b. Attitude toward Patterns in Arithmetic as a useful tool in up- 
grading the skills of rural teachers 

c. Willingness to continue and, if possible, expand the program
next year

2. From the results tabulated thus far, pupil achievement compared
favorably with pilot tests conducted by University of Wisconsin
evaluators. There, 70 per cent of the 9,000 students performed
above the 50th percentile on standardized achievement
tests.

3. Rural isolated teachers with a minimal orientation to the materials
were able to use them successfully. In addition, all of
them have overcome their early fears of using the videotape
recorder.

4. All teachers using Patterns in Arithmetic materials indicated 
their own understanding of mathematics had grown measurably 
during the year. Appreciation was expressed for the "whys" of
arithmetic being so carefully explained by the TV teacher. 

5. In a selected sample, interviews were conducted with first and
second grade students from classes where PIA was used. They
revealed that eight out of ten chose arithmetic as their favorite subject
in school, as compared to a similar group of nonusers, the
majority of whom indicated reading was their favorite area of
study.

6. Improvements suggested by rural teachers who used Patterns in 
Arithmetic included: 

a. Keying the Teacher's Manual to the video presentations and
the student worksheets

b. Identifying the mathematical concepts to be attained in each 
video presentation 

c. Preparing diagnostic tests to enable easier entry into the ma- 
terials by the students 

d. Indicating in the student workbook the skills needed to per- 
form specific lessons satisfactorily 

e. Providing more feedback information to the teachers on stu- 
dent performance following the television presentations and 
the use of the worksheets 

f. Including discussion ideas and helps in the Teacher's Manual
to expand the specific concepts presented

g. Improving the quality of specific film clips for grades 1 and 2

Some interesting questions have arisen from this activity. For
example:

1. What new role expectancies are raised for the teacher? Patterns 
In Arithmetic presents the basic content; the classroom teacher 
becomes supplementary. Where do teachers and paraprofessionals
fit into materials such as PIA?

2. At what point in the development process is it necessary to in- 
troduce the potential user? Should he enter the process early, 
and how involved in its development should he become? 

3. Should new student materials being developed consciously build 
in the inservice factor for teachers, thus updating their skills? 

4. Will this involvement process result in the establishment of a
network that may be used successfully in the future for adaptation/adoption
of other innovations?

Laboratory staff already have engaged in discussions with researchers
at R & D Centers and universities to seek help in gaining
new insights into these and other concerns growing out of this
experience.

Before concluding, a few general remarks seem appropriate regarding
the Laboratory setting and the Northwest Regional Educational
Laboratory specifically.

As you are aware, laboratories were created to serve as a bridge 
between what is known and what is practiced in education. Their 
aim is to speed the movement of quality improvements within the 
educational establishment. This unique idea demonstrates the principle
of creative federalism by returning federal tax funds to the
local level to be used on local needs as determined by local policy
makers. Each laboratory has the freedom to create its own programs 
and to determine its own strategies for attacking these educational 
problems. 

At the Northwest Regional Educational Laboratory there is a fundamental
belief that education can be better in content, procedures,
technology and organization. NWREL is clearly an advocate for
improving education, for innovation and for change. Its primary
mission is developing new and tested alternatives for educators. Fundamental
to its strategy for change, NWREL relies heavily upon
involvement of the institutions, organizations, associations and individuals
with whom it works. Products could be developed more
quickly if concentrated upon in an environment isolated from the
setting where they are to be used. But the Laboratory believes
change will come about more quickly and will be more permanent
if those involved are active participants in the development process.

Involvement of potential users/adopters begins with the selection
of an activity from among available alternatives. It may continue
through prototype development, field testing, evaluation, demonstration
and final adoption. A selected activity can enter this process at
any point, cycling back and forth as needed. In reality this becomes
more of a circle of development than a straight line progression.

The NWREL commitment to a developmental strategy demands 
a wide variety of roles be played by its personnel. It is a constant 
concern of Laboratory staff members that no role be assumed which 
properly belongs to others. The Laboratory has been designed to 
serve as an extension or to complement the roles played by other 
agencies or individuals.

Returning to the five-step planning model initially introduced:

Step 1. Rural isolated elementary teachers often lack the formal
training and/or inservice opportunities needed to update skills in
mathematics. They hurt!

The students taught by teachers lacking formal training are geo- 
graphically isolated and often economically deprived. Many are 
forced to accept a narrowly oriented curriculum which is often woefully
out of date. The children don't compete. They hurt!

Step 2. The low economic capability of many rural schools prevents
them from paying adequate salaries to attract well qualified
teachers. Geographic isolation often prevents community leaders,
school boards and administrators from realizing the educational offerings
of their school are not adequate to compete with their urban
and suburban counterparts. The meager fare offered students in a
stilted environment affects the aspiration level of students. (In one
community, no child in three generations of one family had completed
high school.) That is why they hurt!

Step 3. The PIA program offered rural isolated elementary teachers
an inservice opportunity to update their mathematics skills. Students,
likewise, were afforded the opportunity to participate in a high
quality teaching experience and to learn new, modern mathematics,
thus enhancing their chances of competing. They both received the
aspirin!

Step 4. The field trials of PIA in five isolated areas of the Northwest
showed deficiencies in the materials and the equipment used.
Mathematics is only one of the many subject areas requiring attention.
No, the aspirin is not as big as the headache!

Step 5. The feedback sessions with Center personnel will give
Laboratory staff an opportunity to describe and enumerate the different
headaches caused by the adoption of PIA in rural isolated
schools. Hopefully, a better aspirin will be developed. At least we
hope for one that may reduce the mathematical headaches for rural
elementary teachers and pupils!

Before concluding, a few general remarks seem appropriate regard- 
ing the laboratory setting and the Northwest Laboratory specifically. 

As indicated earlier today, regional laboratories are very, very
young organizations and will be celebrating their third anniversary
in June. As struggling new agencies, they have been required to
mature rapidly. A 25 per cent mortality rate during infancy has been
high. Perhaps it may increase.

As you are aware, laboratories were created to serve as a bridge 
between what is known and what is practiced and to speed quality 
improvements within the educational establishment. Each laboratory
has had great freedom in creating its own programs and in determining
its own strategies for attacking these educational problems.

At the Northwest Laboratory there is a fundamental belief that
education can be better in content, procedures, technology and
organization. The Northwest Laboratory is clearly an advocate for improving
education, for innovation and for change. Its primary mission
is developing new and tested alternatives for educators. Fundamental
to its strategy for change, the Northwest Laboratory relies heavily
upon the involvement of institutions, organizations, associations and
individuals with whom it works. Products could be developed more
quickly if concentrated upon in an environment isolated from the setting
where they are to be used. The Laboratory believes that change
will come about more quickly and will be more permanent if those
who are the focus of change are active participants in the development
process.

Involvement of potential users and adopters begins with the selection
of an activity from among available alternatives. It may continue
through prototype development, field testing, evaluation, demonstra- 
tion and final adoption. A selected activity can enter this process at 
any point, cycling back and forth as needed. In reality this becomes 
more of a circle of development than a straight line progression. 

The Laboratory's commitment to a developmental strategy de- 
mands that a wide variety of roles be played by its personnel. The 
example used earlier illustrated some of these roles well; for example, 
advocate, catalyst, risk taker and sharer, teacher, learner, and often
patient listener. It is a constant concern of Laboratory staff members
that no role be assumed that properly belongs to others. The
Laboratory should serve as an extension or complement to the roles
played by other agencies or individuals.

As a member of such an organization for the past three years, I have
been presented with interesting challenges. To mention a few, I have
enjoyed being actively and intimately involved in here-and-now
educational problems; constantly searching for new and better alternatives;
and trying them out with built-in feedback mechanisms
that report on the selected materials and strategies and that result in
modifying future actions. In short, it is a data-based organization.

I also have enjoyed the possibility of being able to plan sustained 
long-term efforts. Contrary to some people's belief that we are in
business for only one year at a time, we believe we are going to 
be around for a long time. We look forward to being able to sustain 
long-term efforts which have the promise of accumulating a critical 
mass of experience, information and data upon specific problems or 
situations. 

The third thing, being able to marshal knowledgeable experts to
aid in determining priorities and strategies of attack on chosen problems,
to evaluate these efforts, and to redefine or refocus as needed, has also
been an excitement to me.

CONCLUSION 

My purpose in this presentation of one specific NWREL activity
was to: 

1. Demonstrate, by example, how a Laboratory is attempting to
bridge the gap between research knowledge and educational 
practice. 

2. Illustrate the intricacies involved in working with many agencies 
to improve rural education. 

3. Show the many roles that must be played by Laboratory per- 
sonnel. 

I hope I have also been able to convey my enthusiasm and belief
in the laboratory concept as a new means of making educational
improvements.